RG NUCLEIC ACIDS FOR CONFERRING DISEASE RESISTANCE TO PLANTS


The present application is a continuation-in-part application ("CIP") of U.S.
Patent Application Serial No. ("USSN") 08/781,734, filed January 10, 1997. The
aforementioned application is explicitly incorporated herein by reference in its entirety and
for all purposes.

This invention was made with Government support under Grant Nos. 92-37300-7547
and 95-37300-1571, awarded by the United States Department of Agriculture.
The Government has certain rights in this invention.

FIELD OF THE INVENTION 

The present invention relates generally to plant molecular biology. In
particular, it relates to nucleic acids and methods for conferring pest resistance in plants,
particularly lettuce.

BACKGROUND OF THE INVENTION 
Recently, several resistance genes have been cloned by several groups from
several plants. Many of these genes are sequence related. The derived amino acid
sequences of the most common class, RPS2, RPM1 (bacterial resistances in Arabidopsis;
Mindrinos et al., Cell 78:1089-1099 (1994); Bent et al., Science 265:1856-1860 (1994);
Grant et al., Science 269:843-846 (1995)), L6 (fungal resistance in flax; Lawrence et al.,
The Plant Cell 7:1195-1206 (1995)), and N (virus resistance in tobacco; Whitham et al.,
Cell 78:1101-1115 (1994); and U.S. Patent No. 5,571,706), all contain leucine-rich
repeats (LRR) and nucleotide binding sites (NBS).

The NBS is a common motif in several mammalian gene families encoding 
signal transduction components (e.g., Ras) and is associated with ATP/GTP-binding sites. 

LRR domains can mediate protein-protein interactions and are found in a 
variety of proteins involved in signal transduction, cell adhesion and various other 
functions. LRRs are leucine rich regions often comprising 20-30 amino acid repeats 
where leucine and other aliphatic residues occur periodically. LRRs can function 
extracellularly or intracellularly. 

Since the onset of civilization, plant diseases have had catastrophic effects 
on crops and the well-being of the human population. Plant diseases continue to effect 
enormous human and economic costs. An increasing human population and decreasing 
amounts of arable land make all approaches to preventing and treating plant pathogen 
destruction critical. The ability to control and enhance a plant's protective responses
against pathogens would be of enormous benefit. Tissue-specific and temporal control of 
mechanisms responsible for plant cell death would also be of great practical and economic 
value. The present invention fulfills these and other needs. 

What is needed in the art are plant disease resistance genes and means to 
create transgenic disease resistance plants, particularly in lettuce. Further, what is needed 
in the art is a means to DNA fingerprint cultivars and germplasm with respect to their 
disease resistance haplotypes for use in plant breeding programs. The present invention 
provides these and other advantages. 

SUMMARY OF THE INVENTION 
The present invention provides isolated nucleic acid constructs. These
constructs comprise an RG (resistance gene) polynucleotide which encodes an RG
polypeptide having at least 60% sequence identity to an RG polypeptide selected from the
group consisting of: an RG1 polypeptide, an RG2 polypeptide, an RG3 polypeptide, and
an RG4 polypeptide. RG1, RG2, RG3, RG4, and the like, represent individual "RG
families." Each "RG family," as defined herein, is a group of polypeptide sequences that
have at least 60% amino acid sequence identity. Individual members of an RG family,
i.e., individual species of the genus, typically map to the same genomic locus. The
invention provides for constructs comprising nucleotides encoding the RG families of the
invention, which can include sequences encoding a leucine rich region (LRR), a
nucleotide binding site (NBS), or both.

The invention provides for an isolated nucleic acid construct comprising an
RG polynucleotide which encodes an RG polypeptide having at least 60% sequence
identity to an RG polypeptide from an RG family selected from the group consisting of: an
RG1 polypeptide, an RG2 polypeptide, an RG3 polypeptide, an RG4 polypeptide, an RG5
polypeptide, and an RG7 polypeptide. In alternative embodiments, the nucleic acid
construct comprises an RG polynucleotide which encodes an RG polypeptide comprising
a leucine rich region (LRR), or, an RG polypeptide comprising a nucleotide binding site
(NBS). The nucleic acid construct can comprise a polynucleotide which is a full length
gene. In another embodiment, the nucleic acid construct encodes a fusion protein.

In one embodiment, the nucleic acid construct comprises a sequence
encoding an RG1 polypeptide. The RG1 polypeptide can be encoded by a polynucleotide
sequence selected from the group consisting of SEQ ID NO:1 (RG1A), SEQ ID NO:2 and
SEQ ID NO:137 (RG1B), SEQ ID NO:3 (RG1C), SEQ ID NO:4 (RG1D), SEQ ID NO:5
(RG1E), SEQ ID NO:6 (RG1F), SEQ ID NO:7 (RG1G), SEQ ID NO:8 (RG1H), SEQ ID
NO:9 (RG1I), and SEQ ID NO:10 (RG1J).

In another embodiment, the nucleic acid construct comprises a sequence
encoding an RG2 polypeptide. The RG2 polypeptide can be encoded by a polynucleotide
sequence selected from the group consisting of: SEQ ID NO:21 and SEQ ID NO:27
(RG2A); SEQ ID NO:23 and SEQ ID NO:28 (RG2B); SEQ ID NO:29 (RG2C); SEQ ID
NO:30 (RG2D); SEQ ID NO:31 (RG2E); SEQ ID NO:32 (RG2F); SEQ ID NO:33
(RG2G); SEQ ID NO:34 (RG2H); SEQ ID NO:35 (RG2I); SEQ ID NO:36 (RG2J); SEQ
ID NO:37 (RG2K); SEQ ID NO:38 (RG2L); SEQ ID NO:39 (RG2M); SEQ ID NO:87
(RG2A); SEQ ID NO:89 (RG2B); SEQ ID NO:91 (RG2C); SEQ ID NO:93 (RG2D) and
SEQ ID NO:94 (RG2D); SEQ ID NO:96 (RG2E); SEQ ID NO:98 (RG2F); SEQ ID
NO:100 (RG2G); SEQ ID NO:102 (RG2H); SEQ ID NO:104 (RG2I); SEQ ID NO:106
(RG2J) and SEQ ID NO:107 (RG2J); SEQ ID NO:109 (RG2K) and SEQ ID NO:110
(RG2K); SEQ ID NO:112 (RG2L); SEQ ID NO:114 (RG2M); SEQ ID NO:116 (RG2N);
SEQ ID NO:118 (RG2O); SEQ ID NO:120 (RG2P); SEQ ID NO:122 (RG2Q); SEQ ID
NO:124 (RG2S); SEQ ID NO:126 (RG2T); SEQ ID NO:128 (RG2U); SEQ ID NO:130
(RG2V); and, SEQ ID NO:132 (RG2W).

In other embodiments, the nucleic acid construct comprises an RG3 sequence
(SEQ ID NO:68) encoding an RG3 polypeptide (SEQ ID NO:138) (RG3). In other
embodiments, the nucleic acid construct comprises an RG4 sequence (SEQ ID NO:69)
encoding an RG4 polypeptide (SEQ ID NO:139) (RG4).

In other embodiments, the nucleic acid construct comprises an RG5 sequence
(SEQ ID NO:134) encoding an RG5 polypeptide (SEQ ID NO:135). The RG5
polypeptide can be encoded by a polynucleotide sequence as set forth in SEQ ID NO:134.

The invention also provides for a nucleic acid construct which comprises an
RG7 sequence encoding an RG7 polypeptide. The RG7 polypeptide can be encoded by a
polynucleotide sequence as set forth in SEQ ID NO:136.

In further embodiments, the nucleic acid construct can further comprise a
promoter operably linked to the RG polynucleotide. In alternative embodiments, the
promoter can be a plant promoter; a disease resistance promoter; a lettuce promoter; a
constitutive promoter; an inducible promoter; or, a tissue-specific promoter. The nucleic
acid construct can comprise a promoter sequence from an RG gene linked to a
heterologous polynucleotide.

The invention also provides for a transgenic plant comprising a recombinant
expression cassette comprising a promoter operably linked to an RG polynucleotide. The
expression cassette can comprise a plant promoter or a viral promoter; the plant promoter
can be a heterologous promoter. In one embodiment, the transgenic plant is lettuce. In
alternative embodiments, the transgenic plant comprises an expression cassette which
includes an RG polynucleotide selected from the group consisting of SEQ ID NO:1
(RG1A); SEQ ID NO:2 and SEQ ID NO:137 (RG1B); SEQ ID NO:3 (RG1C); SEQ ID
NO:4 (RG1D); SEQ ID NO:5 (RG1E); SEQ ID NO:6 (RG1F); SEQ ID NO:7 (RG1G);
SEQ ID NO:8 (RG1H); SEQ ID NO:9 (RG1I) and SEQ ID NO:10 (RG1J); SEQ ID
NO:21 and SEQ ID NO:27 (RG2A); SEQ ID NO:23 and SEQ ID NO:28 (RG2B); SEQ ID
NO:29 (RG2C); SEQ ID NO:30 (RG2D); SEQ ID NO:31 (RG2E); SEQ ID NO:32
(RG2F); SEQ ID NO:33 (RG2G); SEQ ID NO:34 (RG2H); SEQ ID NO:35 (RG2I); SEQ
ID NO:36 (RG2J); SEQ ID NO:37 (RG2K); SEQ ID NO:38 (RG2L); SEQ ID NO:39
(RG2M); SEQ ID NO:87 (RG2A); SEQ ID NO:89 (RG2B); SEQ ID NO:91 (RG2C); SEQ
ID NO:93 (RG2D) and SEQ ID NO:94 (RG2D); SEQ ID NO:96 (RG2E); SEQ ID
NO:98 (RG2F); SEQ ID NO:100 (RG2G); SEQ ID NO:102 (RG2H); SEQ ID NO:104



WO 98/30083 PCT/US98/00615

(RG2I); SEQ ID NO:106 (RG2J) and SEQ ID NO:107 (RG2J); SEQ ID NO:109 (RG2K)
and SEQ ID NO:110 (RG2K); SEQ ID NO:112 (RG2L); SEQ ID NO:114 (RG2M); SEQ
ID NO:116 (RG2N); SEQ ID NO:118 (RG2O); SEQ ID NO:120 (RG2P); SEQ ID
NO:122 (RG2Q); SEQ ID NO:124 (RG2S); SEQ ID NO:126 (RG2T); SEQ ID NO:128
(RG2U); SEQ ID NO:130 (RG2V); and, SEQ ID NO:132 (RG2W); SEQ ID NO:68
(RG3); SEQ ID NO:69 (RG4); SEQ ID NO:134 (RG5); or SEQ ID NO:136 (RG7).

The invention provides for a transgenic plant comprising an expression
cassette comprising an RG polynucleotide which can encode an RG1 polypeptide selected
from the group consisting of SEQ ID NO:11 (RG1A), SEQ ID NO:12 (RG1B), SEQ ID
NO:13 (RG1C), SEQ ID NO:14 (RG1D), SEQ ID NO:15 (RG1E), SEQ ID NO:16
(RG1F), SEQ ID NO:17 (RG1G), SEQ ID NO:18 (RG1H), SEQ ID NO:19 (RG1I), or
SEQ ID NO:20 (RG1J); or, an RG2 polypeptide selected from the group consisting of SEQ
ID NO:22 and SEQ ID NO:41 (RG2A); SEQ ID NO:24 and SEQ ID NO:42 (RG2B); SEQ
ID NO:43 (RG2C); SEQ ID NO:44 (RG2D); SEQ ID NO:45 (RG2E); SEQ ID NO:46
(RG2F); SEQ ID NO:47 (RG2G); SEQ ID NO:48 (RG2H); SEQ ID NO:49 (RG2I); SEQ
ID NO:50 (RG2J); SEQ ID NO:51 (RG2K); SEQ ID NO:52 (RG2L); SEQ ID NO:53
(RG2M); SEQ ID NO:88 (RG2A); SEQ ID NO:90 (RG2B); SEQ ID NO:92 (RG2C); SEQ
ID NO:95 (RG2D); SEQ ID NO:97 (RG2E); SEQ ID NO:99 (RG2F); SEQ ID NO:101
(RG2G); SEQ ID NO:103 (RG2H); SEQ ID NO:105 (RG2I); SEQ ID NO:108 (RG2J);
SEQ ID NO:111 (RG2K); SEQ ID NO:113 (RG2L); SEQ ID NO:115 (RG2M); SEQ ID
NO:117 (RG2N); SEQ ID NO:119 (RG2O); SEQ ID NO:121 (RG2P); SEQ ID NO:123
(RG2Q); SEQ ID NO:125 (RG2S); SEQ ID NO:127 (RG2T); SEQ ID NO:129 (RG2U);
SEQ ID NO:131 (RG2V); and, SEQ ID NO:133 (RG2W); an RG4 polypeptide as set forth
by SEQ ID NO:72; an RG5 polypeptide with a sequence as set forth by SEQ ID NO:135;
or, an RG7 polypeptide.

The invention also provides for a method of enhancing disease resistance in
a plant, the method comprising introducing into the plant a recombinant expression cassette
comprising a promoter functional in the plant and operably linked to an RG polynucleotide
sequence. In this method, the plant can be a lettuce plant; and, the RG polynucleotide can
encode an RG polypeptide selected from the group consisting of an RG1 polypeptide
selected from the group consisting of SEQ ID NO:11 (RG1A), SEQ ID NO:12 (RG1B),
SEQ ID NO:13 (RG1C), SEQ ID NO:14 (RG1D), SEQ ID NO:15 (RG1E), SEQ ID
NO:16 (RG1F), SEQ ID NO:17 (RG1G), SEQ ID NO:18 (RG1H), SEQ ID NO:19
(RG1I), or SEQ ID NO:20 (RG1J); or, an RG2 polypeptide selected from the group
consisting of SEQ ID NO:22 and SEQ ID NO:41 (RG2A); SEQ ID NO:24 and SEQ ID
NO:42 (RG2B); SEQ ID NO:43 (RG2C); SEQ ID NO:44 (RG2D); SEQ ID NO:45
(RG2E); SEQ ID NO:46 (RG2F); SEQ ID NO:47 (RG2G); SEQ ID NO:48 (RG2H); SEQ
ID NO:49 (RG2I); SEQ ID NO:50 (RG2J); SEQ ID NO:51 (RG2K); SEQ ID NO:52
(RG2L); SEQ ID NO:53 (RG2M); SEQ ID NO:72; SEQ ID NO:74; SEQ ID NO:88
(RG2A); SEQ ID NO:90 (RG2B); SEQ ID NO:92 (RG2C); SEQ ID NO:95 (RG2D);
SEQ ID NO:97 (RG2E); SEQ ID NO:99 (RG2F); SEQ ID NO:101 (RG2G); SEQ ID
NO:103 (RG2H); SEQ ID NO:105 (RG2I); SEQ ID NO:108 (RG2J); SEQ ID NO:111
(RG2K); SEQ ID NO:113 (RG2L); SEQ ID NO:115 (RG2M); SEQ ID NO:117 (RG2N);
SEQ ID NO:119 (RG2O); SEQ ID NO:121 (RG2P); SEQ ID NO:123 (RG2Q); SEQ ID
NO:125 (RG2S); SEQ ID NO:127 (RG2T); SEQ ID NO:129 (RG2U); SEQ ID NO:131
(RG2V); and, SEQ ID NO:133 (RG2W). In this method, the promoter can be a plant
disease resistance promoter, a tissue-specific promoter, a constitutive promoter, or an
inducible promoter.

The invention also provides for a method of detecting RG resistance genes
in a nucleic acid sample, the method comprising: contacting the nucleic acid sample with
an RG polynucleotide to form a hybridization complex, wherein the formation of the
hybridization complex is used to detect the RG resistance gene in the nucleic acid sample.
In this method, the RG polynucleotide can be an RG1 polynucleotide, an RG2
polynucleotide, an RG3 polynucleotide, an RG4 polynucleotide, an RG5 polynucleotide or
an RG7 polynucleotide. In this method, the RG resistance gene can be amplified prior to
the step of contacting the nucleic acid sample with the RG polynucleotide, and the RG
resistance gene can be amplified by the polymerase chain reaction. In one embodiment,
the RG polynucleotide is labeled.

The invention further provides for an RG polypeptide having at least 60%
sequence identity to a polypeptide selected from the group consisting of: an RG1
polypeptide, an RG2 polypeptide, an RG3 polypeptide, an RG4 polypeptide, an RG5
polypeptide, and an RG7 polypeptide.

A further understanding of the nature and advantages of the present
invention may be realized by reference to the remaining portions of the specification, the
figures and claims.

All publications, patents and patent applications cited herein are hereby 
expressly incorporated by reference for all purposes. 

DETAILED DESCRIPTION OF THE INVENTION
This invention relates to families of RG genes, particularly from Lactuca
sativa. Nucleic acid sequences of the present invention can be used to confer resistance in
plants to a variety of pests including viruses, fungi, nematodes, insects, and bacteria.
Sequences from within the RG genes can be used to fingerprint cultivars or germplasm for 
the presence of desired resistance genes. Promoters of RG genes can be used to drive 
heterologous gene expression under conditions in which RG genes are expressed. Further, 
the present invention provides RG proteins and antibodies specifically reactive to RG 
proteins. Antibodies to RG proteins can be used to detect the type and amount of RG 
protein expressed in a plant sample. 

The present invention has use over a broad range of types of plants,
including species from the genera Cucurbita, Rosa, Vitis, Juglans, Fragaria, Lotus,
Medicago, Onobrychis, Trifolium, Trigonella, Vigna, Citrus, Linum, Geranium, Manihot,
Daucus, Arabidopsis, Brassica, Raphanus, Sinapis, Atropa, Capsicum, Datura,
Hyoscyamus, Lycopersicon, Nicotiana, Solanum, Petunia, Digitalis, Majorana,
Cichorium, Helianthus, Lactuca, Bromus, Asparagus, Antirrhinum, Heterocallis, Nemesia,
Pelargonium, Panicum, Pennisetum, Ranunculus, Senecio, Salpiglossis, Cucumis,
Browallia, Glycine, Pisum, Phaseolus, Lolium, Oryza, Zea, Avena, Hordeum, Secale,
Triticum, and Sorghum. In particularly preferred embodiments, species from the family
Compositae and in particular the genus Lactuca are employed such as L. sativa and such
subspecies as crispa, longifolia, and asparagina.

The nucleic acids of the present invention can be used in marker-aided
selection. Marker-aided selection does not require the complete sequence of the gene or
precise knowledge of which sequence confers which specificity. Instead, partial sequences
can be used as hybridization probes or as the basis for oligonucleotide primers to amplify
nucleic acid, e.g., by PCR. Partial sequences can be used in other methods, such as to
follow the segregation of chromosome segments containing resistance genes in plants.
Because the RG marker is the gene itself, there can be negligible recombination between
the marker and the resistance phenotype. Thus, RG polynucleotides of the present
invention provide an optimal means to DNA fingerprint cultivars and wild germplasm with
respect to their disease resistance haplotypes. This can be used to indicate which
germplasm accessions and cultivars carry the same resistance genes. At present, selection
of plants (e.g., lettuce) for resistance to some diseases is slow and difficult. But linked
markers allow indirect selection for such resistance genes. Moreover, RG markers also
allow resistance genes to be identified and combined in a manner that would not otherwise
be possible. Numerous accessions have been identified that provide resistance to all
isolates of downy mildew (Bremia lactucae). However, without molecular markers it is
impossible to combine such resistances from different sources. The nucleic acid sequences
of the invention provide for a fast and convenient means to identify and combine
resistances from different sources. The RG markers of the invention can also be used to
identify recombinants that have new combinations of resistance genes in cis on the same
chromosome.
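
By way of illustration only, the comparison of fingerprints described above (indicating which accessions and cultivars carry the same resistance genes) can be sketched as a grouping of accessions by the set of RG markers detected in each. The function name, the accession names, and the marker calls below are hypothetical and are not data from this specification; the sketch assumes each marker has already been scored as present or absent.

```python
from collections import defaultdict


def group_by_haplotype(fingerprints):
    """Group accessions that share an identical set of detected RG markers.

    fingerprints: dict mapping an accession name to an iterable of marker
    labels scored as present (hypothetical input format).
    Returns a dict mapping each distinct marker set (a frozenset) to the
    sorted list of accessions that carry exactly that set.
    """
    groups = defaultdict(list)
    for accession, markers in fingerprints.items():
        groups[frozenset(markers)].append(accession)
    return {haplotype: sorted(names) for haplotype, names in groups.items()}


# Hypothetical example: two accessions share one fingerprint, a third differs.
example = {
    "accession_1": ["RG2A", "RG2B"],
    "accession_2": ["RG2B", "RG2A"],
    "accession_3": ["RG1A"],
}
grouped = group_by_haplotype(example)
```

Accessions that fall in different groups are candidates for combining resistances from different sources; accessions in the same group are, on these markers alone, indistinguishable.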

In addition, RG markers may allow the identification of the Mendelian 
factors determining traits, such as field resistance to downy mildew. Once such markers 
have been identified, they will greatly increase the ease with which field resistance can be 
transferred between lines and combined with other resistances. 

In another application, primers to RG sequences can also be designed to
amplify sequences that are conserved in multiple RG family members. This gives genetic
information on multiple RG family members. Alternatively, one or more primers can be
made to sequences unique to a single resistance gene genus or a single RG species. This
allows an analysis of individual family groups (an RG genus) or an individual family
member (a species). Primers made to individual RGs at the edge of each cluster can be
used to select for recombinants within the cluster. This minimizes the amount of linkage
drag during introgression. Classical and molecular genetics have shown that pest resistance
genes tend to be clustered in the genome. Pest resistance loci comprise arrays of genes
and exhibit a variety of complex haplotypes rather than being simple alternate allelic
forms. Pest resistance is conferred by families, or genuses, of related RG sequences,
individual members, or species, of which have evolved to have a different specificity.
Oligonucleotide primers can be designed that amplify members from multiple haplotypes,
or genuses, or amplify only members of one genus, or only amplify an individual species.
This will provide codominant information and allow heterozygotes to be distinguished
from homozygotes.
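
As a minimal sketch of how codominant marker information distinguishes heterozygotes from homozygotes, the following assumes a diploid plant and a primer pair specific to a single RG species, so that each distinguishable amplification product corresponds to one allele. The function name and the allele labels are hypothetical and serve only to illustrate the logic.

```python
def call_genotype(observed_alleles):
    """Call a diploid genotype from the alleles observed at one marker locus.

    observed_alleles: iterable of allele labels detected for one plant
    (hypothetical labels; duplicates are ignored).
    """
    alleles = sorted(set(observed_alleles))
    if not alleles:
        return "no amplification"
    if len(alleles) == 1:
        # A single product: both chromosomes carry the same allele.
        return f"homozygous {alleles[0]}/{alleles[0]}"
    if len(alleles) == 2:
        # Two distinct products: one allele from each parent.
        return f"heterozygous {alleles[0]}/{alleles[1]}"
    # More than two alleles suggests the primers amplify several family members.
    raise ValueError("more than two alleles observed at a diploid locus")
```

A dominant marker would report only presence or absence and could not separate the heterozygous call from the homozygous one; the third branch is what the codominant information adds.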

Further, comparison of RG sequences will allow a determination of which
sequences are critical for resistance and will ultimately lead to engineering resistance genes
with new specificities. Resistance gene sequences were not previously available for
lettuce. Marker-aided selection will greatly increase the precision and speed of breeding
for disease resistance. Transgenic approaches will allow pyramiding of resistance genes
into a single Mendelian unit, transfer between sexually-incompatible species, substitute for
conventional backcrossing procedures, and allow expression of other genes in parallel with
resistance genes.

The RG polynucleotides also have utility in the construction of disease
resistant transgenic plants. This avoids lengthy and sometimes difficult backcrossing
programs currently necessary for introgression of resistance. It is also possible to transfer
resistance polynucleotides between sexually-incompatible species, thereby greatly
increasing the germplasm pool that can be used as a source of resistance genes. Cloning of
multiple RG sequences in a single cassette will allow pyramiding of genes for resistance
against multiple isolates of a single pathogen such as downy mildew or against multiple
pathogens. Once introduced, such a cassette can be manipulated by classical breeding
methods as a single Mendelian unit.

Transgenic plants of the present invention can also be constructed using an
RG promoter. The promoter sequences from RG sequences of the invention can be used
with RG genes or heterologous genes. Thus, RG promoters can be used to express a
variety of genes in the same temporal and spatial patterns and at similar levels to resistance
genes.

Nucleic Acids of the Invention and Their Preparation

RG Polynucleotide Families 
The present invention provides isolated nucleic acid constructs which
comprise an RG polynucleotide. In alternative embodiments, the RG polynucleotide is at
least 18 nucleotides in length, typically at least 20, 25, or 30 nucleotides in length, more
typically at least 100 nucleotides in length, generally at least 200 nucleotides in length,
preferably at least 300 nucleotides in length, more preferably at least 400 nucleotides in
length, and most preferably at least 500 nucleotides in length.

In particularly preferred embodiments, the RG polynucleotide encodes an RG
protein which confers resistance to plant pests. This RG protein can be longer than,
equivalent to, or shorter than the RG protein encoded by an RG gene. In various
embodiments, an RG polynucleotide can hybridize under stringent conditions to members
of an RG family (an RG genus); e.g., it can hybridize to a member of the RG1 RG family,
such as an RG1 polynucleotide selected from the group consisting of: SEQ ID NO:1
(RG1A); SEQ ID NO:2 and SEQ ID NO:137 (RG1B); SEQ ID NO:3 (RG1C); SEQ ID
NO:4 (RG1D); SEQ ID NO:5 (RG1E); SEQ ID NO:6 (RG1F); SEQ ID NO:7 (RG1G);
SEQ ID NO:8 (RG1H); SEQ ID NO:9 (RG1I) and SEQ ID NO:10 (RG1J).

In other embodiments, the polynucleotide can also hybridize under stringent
conditions to a member of the RG2 family, such as an RG2 polynucleotide selected from
the group consisting of: SEQ ID NO:21 and SEQ ID NO:27 (RG2A); SEQ ID NO:23 and
SEQ ID NO:28 (RG2B); SEQ ID NO:29 (RG2C); SEQ ID NO:30 (RG2D); SEQ ID
NO:31 (RG2E); SEQ ID NO:32 (RG2F); SEQ ID NO:33 (RG2G); SEQ ID NO:34
(RG2H); SEQ ID NO:35 (RG2I); SEQ ID NO:36 (RG2J); SEQ ID NO:37 (RG2K); SEQ
ID NO:38 (RG2L); SEQ ID NO:39 (RG2M); SEQ ID NO:87 (RG2A); SEQ ID NO:89
(RG2B); SEQ ID NO:91 (RG2C); SEQ ID NO:93 (RG2D) and SEQ ID NO:94 (RG2D);
SEQ ID NO:96 (RG2E); SEQ ID NO:98 (RG2F); SEQ ID NO:100 (RG2G); SEQ ID
NO:102 (RG2H); SEQ ID NO:104 (RG2I); SEQ ID NO:106 (RG2J) and SEQ ID NO:107
(RG2J); SEQ ID NO:109 (RG2K) and SEQ ID NO:110 (RG2K); SEQ ID NO:112
(RG2L); SEQ ID NO:114 (RG2M); SEQ ID NO:116 (RG2N); SEQ ID NO:118 (RG2O);
SEQ ID NO:120 (RG2P); SEQ ID NO:122 (RG2Q); SEQ ID NO:124 (RG2S); SEQ ID
NO:126 (RG2T); SEQ ID NO:128 (RG2U); SEQ ID NO:130 (RG2V); and, SEQ ID
NO:132 (RG2W).

In alternative embodiments, each RG2 gene can also include an AC15
sequence which hybridizes under stringent conditions to a polynucleotide selected from the
group consisting of: SEQ ID NO:56 (AC15-2A); SEQ ID NO:57 (AC15-2B); SEQ ID
NO:58 (AC15-2C); SEQ ID NO:59 (AC15-2D); SEQ ID NO:60 (AC15-2E); SEQ ID
NO:61 (AC15-2G); SEQ ID NO:62 (AC15-2H); SEQ ID NO:63 (AC15-2I); SEQ ID
NO:64 (AC15-2J); SEQ ID NO:65 (AC15-2L); SEQ ID NO:66 (AC15-2N); SEQ ID
NO:67 (AC15-2O).

In other embodiments, an RG polynucleotide can hybridize under stringent
conditions to an RG3 (SEQ ID NO:68), an RG4 (SEQ ID NO:69), an RG5 (SEQ ID
NO:135), and an RG7 (SEQ ID NO:137) RG family member.

The present invention further provides nucleic acid constructs which
comprise an RG polynucleotide which encodes RG polypeptides from various RG families,
such as an RG polypeptide having at least 60% sequence identity to an RG polypeptide
selected from the group consisting of: an RG1 polypeptide, an RG2 polypeptide, an RG3
polypeptide, an RG4 polypeptide, an RG5 polypeptide, and an RG7 polypeptide.

Exemplary RG1 polypeptides have the sequences shown in SEQ ID NO:2
(RG1A), SEQ ID NO:4 (RG1B), SEQ ID NO:6 (RG1C), SEQ ID NO:8 (RG1D), SEQ ID
NO:10 (RG1E), SEQ ID NO:12 (RG1F), SEQ ID NO:14 (RG1G), SEQ ID NO:16
(RG1H), SEQ ID NO:20 (RG1J). Exemplary RG2 polypeptides have the sequences shown
in SEQ ID NO:22 and SEQ ID NO:41 (RG2A); SEQ ID NO:24 and SEQ ID NO:42
(RG2B); SEQ ID NO:43 (RG2C); SEQ ID NO:44 (RG2D); SEQ ID NO:45 (RG2E); SEQ
ID NO:46 (RG2F); SEQ ID NO:47 (RG2G); SEQ ID NO:48 (RG2H); SEQ ID NO:49
(RG2I); SEQ ID NO:50 (RG2J); SEQ ID NO:51 (RG2K); SEQ ID NO:52 (RG2L); SEQ
ID NO:53 (RG2M); SEQ ID NO:88 (RG2A); SEQ ID NO:90 (RG2B); SEQ ID NO:92
(RG2C); SEQ ID NO:95 (RG2D); SEQ ID NO:97 (RG2E); SEQ ID NO:99 (RG2F); SEQ
ID NO:101 (RG2G); SEQ ID NO:103 (RG2H); SEQ ID NO:105 (RG2I); SEQ ID NO:108
(RG2J); SEQ ID NO:111 (RG2K); SEQ ID NO:113 (RG2L); SEQ ID NO:115 (RG2M);
SEQ ID NO:117 (RG2N); SEQ ID NO:119 (RG2O); SEQ ID NO:121 (RG2P); SEQ ID
NO:123 (RG2Q); SEQ ID NO:125 (RG2S); SEQ ID NO:127 (RG2T); SEQ ID NO:129
(RG2U); SEQ ID NO:131 (RG2V); and, SEQ ID NO:133 (RG2W).

An exemplary RG3 polypeptide has the sequence shown in SEQ ID
NO:138. An exemplary RG4 polypeptide has the sequence shown in SEQ ID NO:139.
RG polynucleotides will have at least 60% identity, more typically at least 65% identity,
generally at least 70% identity, and preferably at least 75% identity, more preferably at
least 80% identity, and most preferably at least 85%, 90%, or 95% identity at the deduced
amino acid level. The regions where substantial identity is assessed can be inclusive or
exclusive of the nucleotide binding site or the leucine rich region.
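
For illustration, the 60% identity threshold used to define an RG family can be sketched as a comparison of two deduced amino acid sequences that have already been aligned to equal length, with gaps written as "-". This is a simplified sketch under stated assumptions: the alignment step itself is not shown, columns containing a gap in either sequence are skipped (one of several possible conventions), and the function names are hypothetical. The passage above does not prescribe this calculation.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where either sequence has a gap ('-') are excluded from the
    comparison; this gap convention is an assumption of the sketch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def same_family(aligned_a, aligned_b, threshold=60.0):
    """True if the two sequences meet the identity threshold (default 60%)."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

To assess identity exclusive of the nucleotide binding site or the leucine rich region, the corresponding alignment columns would be removed from both sequences before calling the function.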
Vectors and Transcriptional Control Elements
The invention, providing methods and reagents for making novel species
and genuses of RG nucleic acids described herein, further provides methods and reagents
for expressing these nucleic acids using novel expression cassettes, vectors, transgenic
plants and animals, using constitutive and inducible transcriptional and translational cis-
(e.g., promoters and enhancers) and trans-acting control elements.

The expression of natural, recombinant or synthetic plant disease resistance
polypeptide-encoding or other (i.e., antisense, ribozyme) nucleic acids can be achieved by
operably linking the coding region to a promoter (that can be plant-specific or not,
constitutive or inducible), incorporating the construct into an expression cassette (such as
an expression vector), and introducing the resultant construct into an in vitro reaction
system or a suitable host cell or organism. Synthetic procedures may also be used.
Typical expression systems contain, in addition to coding or antisense sequence,
transcription and translation terminators, polyadenylation sequences, transcription and
translation initiation sequences, and promoters useful for transcribing DNA into RNA.
The expression systems optionally include at least one independent terminator sequence,
sequences permitting replication of the cassette in vivo, e.g., plants, eukaryotes, or
prokaryotes, or a combination thereof (e.g., shuttle vectors), and selection markers for the
selected expression system, e.g., plant, prokaryotic or eukaryotic systems. To ensure proper
polypeptide expression under varying conditions, a polyadenylation region at the 3'-end of
the coding region can be included (see Li (1997) Plant Physiol. 115:321-325, for a review
of the polyadenylation of RNA in plants). The polyadenylation region can be derived from
the natural gene, from a variety of other plant genes, or from T-DNA (e.g., using
Agrobacterium tumefaciens T-DNA replacement vectors, see e.g., Thykjaer (1997) Plant
Mol. Biol. 35:523-530; using a plasmid containing a gene of interest flanked by
Agrobacterium T-DNA border repeat sequences; Hansen (1997) "T-strand integration in
maize protoplasts after codelivery of a T-DNA substrate and virulence genes," Proc. Natl.
Acad. Sci. USA 94:11726-11730).

To identify the promoters, the 5' portions of the clones described here are
analyzed for sequences characteristic of promoter sequences. For instance, promoter
sequence elements include the TATA box consensus sequence (TATAAT), which is
usually 20 to 30 base pairs upstream of the transcription start site. In plants, further
upstream from the TATA box, at positions -80 to -100, there is typically a promoter
element with a series of adenines surrounding the trinucleotide G (or T) N G (see, e.g.,
Messing, in Genetic Engineering in Plants, pp. 221-227, Kosuge, Meredith and
Hollaender, eds. 1983). If proper polypeptide expression is desired, a polyadenylation
region at the 3'-end of the RG coding region should be included. The polyadenylation
region can be derived from the natural gene, from a variety of other plant genes, or from
viral genes, such as T-DNA.
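
The promoter analysis described above can be illustrated by a short scan for the TATA box consensus sequence in the region 20 to 30 base pairs upstream of a known or putative transcription start site. The sketch is hypothetical: the function name is not from this specification, only exact matches to TATAAT on the given strand are reported, and the distance is measured from the first base of the motif to the start site, which is an assumption.

```python
import re

TATA_CONSENSUS = "TATAAT"


def find_tata_boxes(seq, tss, min_upstream=20, max_upstream=30):
    """Return 0-based start indices of TATAAT motifs upstream of a start site.

    seq: DNA sequence of the 5' region (string).
    tss: 0-based index of the transcription start site within seq.
    A motif is kept when its first base lies min_upstream..max_upstream
    bases upstream of tss (an assumed distance convention).
    """
    hits = []
    # Zero-width lookahead so overlapping occurrences are also found.
    for match in re.finditer("(?=" + TATA_CONSENSUS + ")", seq.upper()):
        distance = tss - match.start()
        if min_upstream <= distance <= max_upstream:
            hits.append(match.start())
    return hits


# Synthetic example: the motif starts at index 10 and the start site is at 35.
example_seq = "G" * 10 + "TATAAT" + "C" * 19 + "ATG"
example_hits = find_tata_boxes(example_seq, tss=35)
```

The same approach, with a different pattern and window, could be applied to the upstream element at positions -80 to -100; degenerate matches to the consensus would require a scoring scheme that is not shown here.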

The nucleic acids of the invention can be expressed in expression cassettes,
vectors or viruses which are transiently expressed in cells using, for example, episomal
expression systems (e.g., cauliflower mosaic virus (CaMV) viral RNA is generated in the
nucleus by transcription of an episomal minichromosome containing supercoiled DNA,
Covey (1990) Proc. Natl. Acad. Sci. USA 87:1633-1637). Alternatively, coding sequences
can be inserted into the host cell genome becoming an integral part of the host
chromosomal DNA.

Selection markers can be incorporated into expression cassettes and vectors
to confer a selectable phenotype on transformed cells and sequences coding for episomal
maintenance and replication such that integration into the host genome is not required. For
example, the marker may encode biocide resistance, such as antibiotic resistance,
particularly resistance to chloramphenicol, kanamycin, G418, bleomycin, hygromycin, or
herbicide resistance, such as resistance to chlorosulfuron or Basta, to permit selection of
those cells transformed with the desired DNA sequences, see for example,
Blondelet-Rouault (1997) Gene 190:315-317; Aubrecht (1997) J. Pharmacol. Exp. Ther.
281:992-997. Because selectable marker genes conferring resistance to substrates like
neomycin or hygromycin can only be utilized in tissue culture, chemoresistance genes are
also used as selectable markers in vitro and in vivo. See also, Mengiste (1997)
"High-efficiency transformation of Arabidopsis thaliana with a selectable marker gene
regulated by the T-DNA 1' promoter," Plant J. 12:945-948, showing that the 1' promoter
is an attractive alternative to the cauliflower mosaic virus (CaMV) 35S promoter for the
generation of T-DNA insertion lines; the 1' promoter may be especially beneficial for the
secondary transformation of transgenic strains containing the 35S promoter to exclude
homology-mediated gene silencing.
The endogenous promoters from the RG genes of the present invention can
be used to direct expression of the genes. These promoters can also be used to direct
expression of heterologous structural genes. The promoters can be used, for example, in
recombinant expression cassettes to drive expression of genes conferring resistance to any
number of pathogens or pests, including fungi, bacteria, and the like.

Constitutive Promoters

In construction of recombinant expression cassettes, vectors, and transgenics of
the invention, a promoter fragment can be employed to direct expression of the desired
gene in all tissues of a plant or animal. Promoters that drive expression continuously
under physiological conditions are referred to as "constitutive" promoters and are active
under most environmental conditions and states of development or cell differentiation.
Examples of constitutive promoters include those from viruses which infect plants, such as
the cauliflower mosaic virus (CaMV) 35S transcription initiation region; the 1'- or 2'-
promoter derived from T-DNA of Agrobacterium tumefaciens; the promoter of the tobacco
mosaic virus; and other transcription initiation regions from various plant genes known to
those of skill. See also Holtorf (1995) "Comparison of different constitutive and inducible
promoters for the overexpression of transgenes in Arabidopsis thaliana," Plant Mol. Biol.
29:637-646.

Inducible Promoters 

Alternatively, a plant promoter may direct expression of the plant disease
resistance nucleic acid of the invention under the influence of changing environmental
conditions or developmental conditions. Examples of environmental conditions that may
affect transcription by inducible promoters include pathogenic attack, anaerobic conditions,
elevated temperature, drought, or the presence of light. Such promoters are referred to
herein as "inducible" promoters. For example, the invention incorporates the drought-inducible
promoter of maize (Busk (1997) supra) and the cold, drought, and high salt
inducible promoter from potato (Kirch (1997) Plant Mol. Biol. 33:897-909).

Embodiments of the invention also incorporate use of plant promoters which
are inducible upon injury or infection to express the invention's plant disease resistance
(RG) polypeptides. Various embodiments include use of, e.g., the promoter for a tobacco
(Nicotiana tabacum) sesquiterpene cyclase gene (EAS4 promoter), which is expressed in
wounded leaf, root, and stem tissues, and upon infection with microbial pathogens (Yin
(1997) Plant Physiol. 115(2):437-451); the ORF13 promoter from Agrobacterium
rhizogenes 8196, which is wound inducible in a limited area adjacent to the wound site
(Hansen (1997) Mol. Gen. Genet. 254:337-343); the Shpx6b gene promoter, which is a
plant peroxidase gene promoter induced by microbial pathogens (demonstrated using a
fungal pathogen, see Curtis (1997) Mol. Plant Microbe Interact. 10:326-338); the
wound-inducible gene promoter wun1, derived from potato (Siebertz (1989) Plant Cell
1:961-968); and the wound-inducible Agrobacterium pmas gene (mannopine synthesis gene)
promoter (Guevara-Garcia (1993) Plant J. 4:495-505).

Alternatively, plant promoters which are inducible upon exposure to plant
hormones, such as auxins, are used to express the nucleic acids of the invention. For
example, the invention can use the auxin-response elements E1 promoter fragment
(AuxREs) in the soybean (Glycine max L.) (Liu (1997) Plant Physiol. 115:397-407); the
auxin-responsive Arabidopsis GST6 promoter (also responsive to salicylic acid and
hydrogen peroxide) (Chen (1996) Plant J. 10:955-966); the auxin-inducible parC
promoter from tobacco (Sakai (1996) 37:906-913); a plant biotin response element (Streit
(1997) Mol. Plant Microbe Interact. 10:933-937); and the promoter responsive to the
stress hormone abscisic acid (Sheen (1996) Science 274:1900-1902).

Plant promoters which are inducible upon exposure to chemical reagents
which can be applied to the plant, such as herbicides or antibiotics, are also used to express
the nucleic acids of the invention. For example, the maize In2-2 promoter, activated by
benzenesulfonamide herbicide safeners, can be used (De Veylder (1997) Plant Cell
Physiol. 38:568-577); application of different herbicide safeners induces distinct gene
expression patterns, including expression in the root, hydathodes, and the shoot apical
meristem. Coding sequence can be under the control of, e.g., a tetracycline-inducible
promoter, e.g., as described with transgenic tobacco plants containing the Avena sativa L.
(oat) arginine decarboxylase gene (Masgrau (1997) Plant J. 11:465-473); or a salicylic
acid-responsive element (Stange (1997) Plant J. 11:1315-1324). Using chemically- (e.g.,
hormone- or pesticide-) induced promoters, harvesting of fruits and plant parts would be
greatly facilitated. A chemical which can be applied to the transgenic plant in the field and
induce expression of a polypeptide of the invention throughout all or most of the plant
would make an environmentally safe defoliant or herbicide. Thus, the invention also
provides for transgenic plants containing an inducible gene encoding the RG
polypeptides of the invention whose host range is limited to target plant species, such as
weeds or crops before, during or after harvesting.

Abscission promoters are activated upon plant ripening, such as fruit
ripening, and are especially useful when incorporated in the expression systems (e.g., expression
cassettes, vectors) of the invention. In some embodiments, when a plant disease resistant
polypeptide-encoding nucleic acid is under the control of such a promoter, rapid cell death,
induced by expression of the invention's polypeptide, can accelerate and/or accentuate
abscission, increasing the efficiency of the harvesting of fruits or other plant parts, such as
cotton, and the like. Induction of rapid cell death at this time would accelerate separation
of the fruit from the plant, greatly augmenting harvesting procedures. See, e.g., Kalaitzis
(1997) Plant Physiol. 113:1303-1308, discussing tomato leaf and flower abscission; Payton
(1996) Plant Mol. Biol. 31:1227-1231, discussing ethylene receptor expression regulation
during fruit ripening, flower senescence and abscission; Koehler (1996) Plant Mol. Biol.
31:595-606, discussing the gene promoter for a bean abscission cellulase; Kalaitzis (1995)
Plant Mol. Biol. 28:647-656, discussing cloning of a tomato polygalacturonase expressed
in abscission; del Campillo (1996) Plant Physiol. 111:813-820, discussing pedicel
breakstrength and cellulase gene expression during tomato flower abscission.
Tissue-Specific Promoters 

Tissue specific promoters are transcriptional control elements that are only
active in particular cells or tissues. Plant promoters which are active only in specific
tissues or at specific times during plant development are used to express the nucleic acids
of the invention. Examples of promoters under developmental control include promoters
that initiate transcription only in certain tissues, such as leaves, roots, fruit, seeds, ovules,
pollen, pistils, or flowers. Such promoters are referred to as "tissue specific". The
operation of a promoter may also vary depending on its location in the genome. Thus, an
inducible promoter may become fully or partially constitutive in certain locations.

For example, a seed-specific promoter directs expression in seed tissues.
Such promoters may be, for example, ovule-specific, embryo-specific, endosperm-specific,
integument-specific, seed coat-specific, or some combination thereof. A leaf-specific
promoter has been identified in maize, Busk (1997) Plant J. 11:1285-1295. The ORF13
promoter from Agrobacterium rhizogenes exhibits high activity in roots (Hansen (1997)
supra). A pollen-specific promoter has been identified in maize (Guerrero (1990)
Mol. Gen. Genet. 224:161-168). A tomato promoter active during fruit ripening,
senescence and abscission of leaves and, to a lesser extent, of flowers can be used (Blume
(1997) Plant J. 12:731-746). A pistil-specific promoter has been identified in the potato
(Solanum tuberosum L.) SK2 gene, encoding a pistil-specific basic endochitinase (Ficker
(1997) Plant Mol. Biol. 35:425-431). The Blec4 gene from pea (Pisum sativum cv.
Alaska) is active in epidermal tissue of vegetative and floral shoot apices of transgenic
alfalfa, making it a useful tool to target the expression of foreign genes to the epidermal
layer of actively growing shoots. The activity of the Blec4 promoter in the epidermis of the
shoot apex makes it particularly suitable for genetically engineering defense against insects
and diseases that attack the growing shoot apex (Mandaci (1997) Plant Mol. Biol.
34:961-965).

The invention also provides for use of tissue-specific plant promoters,
including a promoter from the ovule-specific BEL1 gene described in Reiser (1995) Cell
83:735-742, GenBank No. U39944. Suitable seed specific promoters are derived from the
following genes: MAC1 from maize, Sheridan (1996) Genetics 142:1009-1020; Cat3 from
maize, GenBank No. L05934, Abler (1993) Plant Mol. Biol. 22:1031-1038; the gene
encoding oleosin 18kD from maize, GenBank No. J05212, Lee (1994) Plant Mol. Biol.
26:1981-1987; viviparous-1 from Arabidopsis, GenBank No. U93215; the gene encoding
oleosin from Arabidopsis, GenBank No. Z17657; Atmyc1 from Arabidopsis, Urao (1996)
Plant Mol. Biol. 32:571-576; the 2S seed storage protein gene family from Arabidopsis,
Conceicao (1994) Plant J. 5:493-505; the gene encoding oleosin 20kD from Brassica napus,
GenBank No. M63985; napA from Brassica napus, GenBank No. J02798, Josefsson
(1987) J. Biol. Chem. 262:12196-12201; the napin gene family from Brassica napus, Sjodahl (1995)
Planta 197:264-271; the gene encoding the 2S storage protein from Brassica napus,
Dasgupta (1993) Gene 133:301-302; the genes encoding oleosin A, GenBank No. U09118,
and oleosin B, GenBank No. U09119, from soybean; and the gene encoding low
molecular weight sulphur rich protein from soybean, Choi (1995) Mol. Gen. Genet.
246:266-268. The tissue specific E8 promoter from tomato is particularly useful for
directing gene expression so that a desired gene product is located in fruits. Other suitable
promoters include those from genes encoding embryonic storage proteins.

One of skill will recognize that a tissue-specific promoter may drive
expression of operably linked sequences in tissues other than the target tissue. Thus, as
used herein a tissue-specific promoter is one that drives expression preferentially in the
target tissue, but may also lead to some expression in other tissues as well.

The invention also provides for use of tissue-specific promoters derived
from viruses which can include, e.g., the tobamovirus subgenomic promoter (Kumagai
(1995) Proc. Natl. Acad. Sci. USA 92:1679-1683); the rice tungro bacilliform virus
(RTBV), which replicates only in phloem cells in infected rice plants, with its promoter
which drives strong phloem-specific reporter gene expression; the cassava vein mosaic
virus (CVMV) promoter, with highest activity in vascular elements, in leaf mesophyll
cells, and in root tips (Verdaguer (1996) Plant Mol. Biol. 31:1129-1139).

In some embodiments, the nucleic acid construct will comprise a promoter
functional in a specific plant cell, such as in a species of Lactuca, operably linked to an RG
polynucleotide. Promoters useful in these embodiments include RG promoters. In
additional embodiments, the nucleic acid construct will comprise an RG promoter operably
linked to a heterologous polynucleotide. The heterologous polynucleotide is chosen to
provide a plant with a desired phenotype. For example, the heterologous polynucleotide
can be a structural gene which encodes a polypeptide which imparts a desired resistance
phenotype. Alternatively, the heterologous polynucleotide may be a regulatory gene which
might play a role in transcriptional and/or translational control to suppress, enhance, or
otherwise modify the transcription and/or expression of an endogenous gene within the
plant. The heterologous polynucleotide of the nucleic acid construct of the present
invention can be expressed in either sense or anti-sense orientation as desired. It will be
appreciated that control of gene expression in either sense or anti-sense orientation can
have a direct impact on the observable plant characteristics.
Modifying and Inhibiting RG Gene Expression 

The invention also provides for RG nucleic acid sequences which are
complementary to the RG polypeptide-encoding sequences of the invention; i.e., antisense
RG nucleic acids. Antisense technology can be conveniently used to modify gene
expression in plants. To accomplish this, a nucleic acid segment from the desired gene is
cloned and operably linked to a promoter such that the anti-sense strand of RNA will be
transcribed. The construct is then transformed into plants and the antisense strand of RNA
is produced. In plant cells, it has been shown that antisense RNA inhibits gene expression
by preventing the accumulation of mRNA which encodes the enzyme of interest, see, e.g.,
Sheehy (1988) Proc. Natl. Acad. Sci. USA 85:8805-8809; Hiatt et al., U.S. Patent No.
4,801,540.
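
The relationship between a cloned sense segment and the antisense RNA transcribed from it can be sketched in a few lines of Python. This is only an illustration: the fragment shown is a made-up placeholder rather than an RG sequence, and the function names are hypothetical.

```python
# Illustrative sketch only; the input fragment is a hypothetical placeholder,
# not a sequence disclosed in this document.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string containing only A, C, G, T."""
    if set(dna.upper()) - set("ACGT"):
        raise ValueError("sequence must contain only A, C, G, T")
    return dna.translate(_COMPLEMENT)[::-1]


def antisense_rna(sense_dna: str) -> str:
    """RNA transcribed when the segment is placed in antisense orientation.

    The transcript is the reverse complement of the sense strand, written
    5' to 3' with U in place of T, so it can base-pair with the mRNA.
    """
    return reverse_complement(sense_dna).upper().replace("T", "U")


if __name__ == "__main__":
    sense_fragment = "ATGGCTAGCTTGACC"  # hypothetical sense-strand fragment
    print(antisense_rna(sense_fragment))
```

The sketch ignores promoter and terminator sequences; it only shows why reversing the orientation of the insert yields a transcript complementary to the endogenous message.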

Antisense sequences are capable of inhibiting the transport, splicing or
transcription of RG-encoding genes. The inhibition can be effected through the targeting
of genomic DNA or messenger RNA. The transcription or function of targeted nucleic
acid can be inhibited, e.g., by hybridization and/or cleavage. One particularly useful set
of inhibitors provided by the present invention includes oligonucleotides which are able to
bind either the RG gene or its message, in either case preventing or inhibiting the production or
function of RG. The association can be through sequence specific hybridization. Such
inhibitory nucleic acid sequences can, for example, be used to completely inhibit a plant
disease resistance response. Another useful class of inhibitors includes oligonucleotides
which cause inactivation or cleavage of RG message. The oligonucleotide can have
enzyme activity which causes such cleavage, such as ribozymes. The oligonucleotide can
be chemically modified or conjugated to an enzyme or composition capable of cleaving the
complementary nucleic acid. One may screen a pool of many different such
oligonucleotides for those with the desired activity.
Antisense Oligonucleotides 

The invention provides antisense oligonucleotides capable of
binding RG message which can inhibit RG activity by targeting mRNA. Strategies for
designing antisense oligonucleotides are well described in the scientific and patent
literature, and the skilled artisan can design such RG oligonucleotides using the novel
reagents of the invention. In some situations, naturally occurring nucleic acids used as
antisense oligonucleotides may need to be relatively long (18 to 40 nucleotides) and present
at high concentrations. A wide variety of synthetic, non-naturally occurring nucleotide and
nucleic acid analogues are known which can address this potential problem. For example,
peptide nucleic acids (PNAs) containing non-ionic backbones, such as N-(2-aminoethyl)
glycine units can be used. Antisense oligonucleotides having phosphorothioate linkages
can also be used, as described in WO 97/03211; WO 96/39154; Mata (1997) Toxicol. Appl.
Pharmacol. 144:189-197; Antisense Therapeutics, ed. Agrawal (Humana Press, Totowa,
N.J., 1996). Antisense oligonucleotides having synthetic DNA backbone analogues
provided by the invention can also include phosphoro-dithioate, methylphosphonate,
phosphoramidate, alkyl phosphotriester, sulfamate, 3'-thioacetal, methylene(methylimino),
3'-N-carbamate, and morpholino carbamate nucleic acids, as described herein.

Combinatorial chemistry methodology can be used to create vast numbers of 
oligonucleotides that can be rapidly screened for specific oligonucleotides that have 
appropriate binding affinities and specificities toward any target, such as the sense and 
antisense RG sequences of the invention (for general background information, see, e.g., 
Gold (1995) J. of Biol. Chem. 270:13581-13584).

Inhibitory Ribozymes 

The invention provides ribozymes capable of binding RG message
which can inhibit RG activity by targeting mRNA. Strategies for designing ribozymes and
selecting the RG-specific antisense sequence for targeting are well described in the
scientific and patent literature, and the skilled artisan can design such RG ribozymes using
the novel reagents of the invention. Ribozymes act by binding to a target RNA through the
target RNA binding portion of a ribozyme which is held in close proximity to an enzymatic
portion of the RNA that cleaves the target RNA. Thus, the ribozyme recognizes and binds
a target RNA through complementary base-pairing, and once bound to the correct site, acts
enzymatically to cleave and inactivate the target RNA. Cleavage of a target RNA in such a
manner will destroy its ability to direct synthesis of an encoded protein if the cleavage
occurs in the coding sequence, or can prevent transport of the message from the nucleus to
the cytoplasm. After a ribozyme has bound and cleaved its RNA target, it is typically
released from that RNA and so can bind and cleave new targets repeatedly.

Catalytic RNA molecules or ribozymes can also be used to inhibit 
expression of any plant gene. It is possible to design ribozymes that specifically pair with 
virtually any target RNA and cleave the phosphodiester backbone at a specific location, 
thereby functionally inactivating the target RNA. In carrying out this cleavage, the
ribozyme is not itself altered, and is thus capable of recycling and cleaving other
molecules, making it a true enzyme. The inclusion of ribozyme sequences within antisense
RNAs confers RNA-cleaving activity upon them, thereby increasing the activity of the
constructs. The design and use of target RNA-specific ribozymes is described, e.g., in
Haseloff (1988) Nature 334:585-591.
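
As a rough illustration of how candidate cleavage sites might be located in a target RNA, the Python sketch below scans for a cleavage triplet and reports the flanking sequence available for base-pairing. It rests on assumptions not stated in the text: GUC is used as the triplet because it is a classical hammerhead cleavage site, the 8-nucleotide arm length is arbitrary, and the function name and example sequence are hypothetical.

```python
from typing import List, Tuple


def candidate_cleavage_sites(
    target_rna: str, triplet: str = "GUC", arm: int = 8
) -> List[Tuple[int, str, str]]:
    """List candidate ribozyme cleavage sites in a target RNA.

    Each result is (cut_index, upstream, downstream), where cleavage is
    assumed to occur immediately 3' of the triplet and `upstream` and
    `downstream` are the `arm`-nucleotide stretches on either side of the
    cut that the ribozyme binding arms would need to pair with.
    Sites too close to either end of the target are skipped.
    """
    rna = target_rna.upper().replace("T", "U")  # accept DNA-style input
    sites = []
    start = rna.find(triplet)
    while start != -1:
        cut = start + len(triplet)
        if cut - arm >= 0 and cut + arm <= len(rna):
            sites.append((cut, rna[cut - arm:cut], rna[cut:cut + arm]))
        start = rna.find(triplet, start + 1)  # allow overlapping matches
    return sites


if __name__ == "__main__":
    example = "AAAAAAAAGUCAAAAAAAA"  # hypothetical target, not an RG sequence
    for cut, upstream, downstream in candidate_cleavage_sites(example):
        print(cut, upstream, downstream)
```

A real design would also weigh target accessibility and secondary structure, which this sketch does not attempt.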

In some circumstances, the enzymatic nature of a ribozyme can be
advantageous over other technologies, such as antisense technology (where a nucleic acid
molecule simply binds to a nucleic acid target to block its transcription, translation or
association with another molecule) as the effective concentration of ribozyme necessary to
effect a therapeutic treatment can be lower than that of an antisense oligonucleotide. This
potential advantage reflects the ability of the ribozyme to act enzymatically. Thus, a single
ribozyme molecule is able to cleave many molecules of target RNA. In addition, a
ribozyme is typically a highly specific inhibitor, with the specificity of inhibition
depending not only on the base pairing mechanism of binding, but also on the mechanism
by which the molecule inhibits the expression of the RNA to which it binds. That is, the
inhibition is caused by cleavage of the RNA target and so specificity is defined as the ratio
of the rate of cleavage of the targeted RNA over the rate of cleavage of non-targeted RNA.
This cleavage mechanism is dependent upon factors additional to those involved in base
pairing. Thus, the specificity of action of a ribozyme can be greater than that of antisense
oligonucleotide binding the same RNA site.

The enzymatic ribozyme RNA molecule can be formed in a hammerhead
motif, but may also be formed in the motif of a hairpin, hepatitis delta virus, group I intron
or RNaseP-like RNA (in association with an RNA guide sequence). Examples of such
hammerhead motifs are described by Rossi (1992) Aids Research and Human Retroviruses
8:183; hairpin motifs by Hampel (1989) Biochemistry 28:4929, and Hampel (1990) Nuc.
Acids Res. 18:299; the hepatitis delta virus motif by Perrotta (1992) Biochemistry 31:16;
the RNaseP motif by Guerrier-Takada (1983) Cell 35:849; and the group I intron by Cech
U.S. Pat. No. 4,987,071. The recitation of these specific motifs is not intended to be
limiting; those skilled in the art will recognize that an enzymatic RNA molecule of this
invention has a specific substrate binding site complementary to one or more of the target
gene RNA regions, and has nucleotide sequence within or surrounding that substrate
binding site which imparts an RNA cleaving activity to the molecule.
Sense Suppression

Another method of suppression is sense suppression. Introduction of
nucleic acid configured in the sense orientation has been shown to be an effective means by
which to block the transcription of target genes. For an example of the use of this method
to modulate expression of endogenous genes see, Napoli et al., The Plant Cell 2:279-289
(1990), and U.S. Patent No. 5,034,323.
Cloning of RG Polypeptides

Synthesis and/or cloning of RG polynucleotides and isolated nucleic acid
constructs of the present invention are provided by methods well known to those of
ordinary skill in the art. Generally, the nomenclature and the laboratory procedures in
recombinant DNA technology described below are those well known and commonly
employed in the art. Standard techniques are used for cloning, DNA and RNA isolation,
amplification and purification. Generally enzymatic reactions involving DNA ligase, DNA
polymerase, restriction endonucleases and the like are performed according to the
manufacturer's specifications. These techniques and various other techniques are generally
performed according to Sambrook et al., Molecular Cloning - A Laboratory Manual, Cold
Spring Harbor Laboratory, Cold Spring Harbor, New York, (1989).

The isolation of RG genes may be accomplished by a number of techniques. 
For instance, oligonucleotide probes based on the sequences disclosed here can be used to 
identify the desired gene in a cDNA or genomic DNA library. To construct genomic 
libraries, large segments of genomic DNA are generated by random fragmentation, e.g. 
using restriction endonucleases, and are ligated with vector DNA to form concatemers that 
can be packaged into the appropriate vector. To prepare a cDNA library, mRNA is 
isolated from the desired organ, such as roots, and a cDNA library which contains the RG
gene transcript is prepared from the mRNA. Alternatively, cDNA may be prepared from 
mRNA extracted from other tissues in which RG genes or homologs are expressed. 

The cDNA or genomic library can then be screened using a probe based 
upon the sequence of a cloned RG gene such as the genes disclosed herein. Probes may be 
used to hybridize with genomic DNA or cDNA sequences to isolate homologous genes in 
the same or different plant species. 

Those of skill in the art will appreciate that various degrees of stringency of 
hybridization can be employed in the assay; and either the hybridization or the wash 
medium can be stringent. As the conditions for hybridization become more stringent,
there must be a greater degree of complementarity between the probe and the target for
duplex formation to occur. The degree of stringency can be controlled by temperature,
ionic strength, pH and the presence of a partially denaturing solvent such as formamide.
For example, the stringency of hybridization is conveniently varied by changing the
polarity of the reactant solution through manipulation of the concentration of formamide
within the range of 0% to 50%.
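
The effect of formamide and ionic strength on stringency can be made concrete with a commonly used empirical approximation for the melting temperature of long DNA-DNA hybrids. The formula below is not taken from this document; it is a widely cited rule of thumb, and the function name and example values are hypothetical.

```python
import math


def estimate_tm_celsius(
    length_bp: int,
    gc_percent: float,
    monovalent_cation_molar: float,
    formamide_percent: float = 0.0,
) -> float:
    """Approximate melting temperature (degrees C) of a DNA-DNA hybrid.

    Empirical approximation for hybrids longer than roughly 50 bp:
        Tm = 81.5 + 16.6*log10(M) + 0.41*(%GC) - 0.61*(%formamide) - 500/L
    where M is the monovalent cation molarity and L the hybrid length in bp.
    """
    if length_bp <= 0 or monovalent_cation_molar <= 0:
        raise ValueError("length and salt concentration must be positive")
    return (
        81.5
        + 16.6 * math.log10(monovalent_cation_molar)
        + 0.41 * gc_percent
        - 0.61 * formamide_percent
        - 500.0 / length_bp
    )


if __name__ == "__main__":
    # Hypothetical 100 bp probe, 50% GC, 1 M monovalent cation.
    for formamide in (0, 25, 50):
        print(formamide, round(estimate_tm_celsius(100, 50.0, 1.0, formamide), 1))
```

Under this approximation each added percent of formamide lowers the estimated melting temperature by about 0.6 degrees C, so hybridizing at a fixed temperature becomes more stringent as the formamide concentration rises across the 0% to 50% range.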

Alternatively, the RG nucleic acids of the invention can be amplified from
nucleic acid samples using a variety of amplification techniques, such as polymerase chain
reaction (PCR) technology, to amplify the sequences of the RG and related genes directly
from genomic DNA, from cDNA, from genomic libraries or cDNA libraries. PCR and
other in vitro amplification methods may also be useful, for example, to clone nucleic acid
sequences that code for proteins to be expressed, to make nucleic acids to use as probes for
detecting the presence of the desired mRNA in samples, for nucleic acid sequencing, or for
other purposes.

Oligonucleotides can be used to identify and detect additional RG families
and RG family species using a variety of hybridization techniques and conditions. Suitable
amplification methods include, but are not limited to: polymerase chain reaction, PCR
(PCR Protocols, A Guide to Methods and Applications, ed. Innis, Academic Press,
N.Y. (1990) and PCR STRATEGIES (1995), ed. Innis, Academic Press, Inc., N.Y.
(Innis)), ligase chain reaction (LCR) (Wu (1989) Genomics 4:560; Landegren (1988) Science
241:1077; Barringer (1990) Gene 89:117); transcription amplification (Kwoh (1989) Proc.
Natl. Acad. Sci. USA 86:1173); self-sustained sequence replication (Guatelli (1990)
Proc. Natl. Acad. Sci. USA 87:1874); and Q Beta replicase amplification and other RNA
polymerase mediated techniques (e.g., NASBA, Cangene, Mississauga, Ontario); see
Berger (1987) Methods Enzymol. 152:307-316, Sambrook, and Ausubel, as well as Mullis
(1987) U.S. Patent Nos. 4,683,195 and 4,683,202; Arnheim (1990) C&EN 36-47; Lomell
J. Clin. Chem., 35:1826 (1989); Van Brunt, Biotechnology, 8:291-294 (1990); Wu (1989)
Gene 4:560; Sooknanan (1995) Biotechnology 13:563-564. Methods for cloning in vitro
amplified nucleic acids are described in Wallace, U.S. Pat. No. 5,426,039.

The degree of complementarity (sequence identity) required for detectable
binding will vary in accordance with the stringency of the hybridization medium and/or
wash medium. The degree of complementarity will optimally be 100 percent; however, it
should be understood that minor sequence variations in the probes and primers may be
compensated for by reducing the stringency of the hybridization and/or wash medium as
described earlier.

In some preferred embodiments, members of this class of pest resistance
genes can be identified by their ability to be amplified by PCR primers based on the
sequences disclosed here. Appropriate primers and probes for identifying RG sequences
from plant tissues are generated from comparisons of the sequences provided herein. See,
e.g., Table 1. For a general overview of PCR see PCR Protocols: A Guide to Methods
and Applications, (Innis, M., Gelfand, D., Sninsky, J. and White, T., eds.), Academic
Press, San Diego (1990), incorporated herein by reference.

Briefly, the first step of each cycle of the PCR involves the separation of the
nucleic acid duplex formed by the primer extension. Once the strands are separated, the
next step in PCR involves hybridizing the separated strands with primers that flank the
target sequence. The primers are then extended to form complementary copies of the
target strands. For successful PCR amplification, the primers are designed so that the
position at which each primer hybridizes along a duplex sequence is such that an extension
product synthesized from one primer, when separated from the template (complement),
serves as a template for the extension of the other primer. The cycle of denaturation,
hybridization, and extension is repeated as many times as necessary to obtain the desired
amount of amplified nucleic acid.
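
Because each extension product serves as a template in the following cycle, the amount of target grows roughly geometrically with cycle number. The Python sketch below models this with the usual idealized relation N = N0 x (1 + E)^n, where E is the per-cycle efficiency; the model, the function names, and the example numbers are illustrative assumptions rather than part of the described method.

```python
def copies_after(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized copy number after a number of PCR cycles.

    efficiency is the fraction of templates copied per cycle
    (1.0 means perfect doubling); real reactions plateau, which
    this simple model ignores.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


def cycles_needed(initial_copies: float, target_copies: float, efficiency: float = 1.0) -> int:
    """Smallest number of cycles that reaches at least target_copies."""
    if initial_copies <= 0:
        raise ValueError("initial_copies must be positive")
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be greater than 0 and at most 1")
    cycles = 0
    copies = float(initial_copies)
    while copies < target_copies:
        copies *= 1.0 + efficiency  # one round of denaturation, annealing, extension
        cycles += 1
    return cycles


if __name__ == "__main__":
    print(copies_after(1, 30))            # perfect doubling from a single template
    print(cycles_needed(100, 1e9, 0.9))   # hypothetical 90% efficient reaction
```

The second function illustrates the phrase "as many times as necessary": the cycle count follows from the starting template amount, the desired yield, and the assumed efficiency.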

In the preferred embodiment of the PCR process, strand separation is
achieved by heating the reaction to a sufficiently high temperature for a sufficient time to
cause the denaturation of the duplex but not to cause an irreversible denaturation of the
polymerase (see U.S. Patent No. 4,965,188). Template-dependent extension of primers in
PCR is catalyzed by a polymerizing agent in the presence of adequate amounts of four
deoxyribonucleotide triphosphates (typically dATP, dGTP, dCTP, and dTTP) in a reaction
medium comprised of the appropriate salts, metal cations, and pH buffering system.
Suitable polymerizing agents are enzymes known to catalyze template-dependent DNA
synthesis.

Polynucleotides may also be synthesized by well-known techniques as
described in the technical literature. See, e.g., Carruthers et al., Cold Spring Harbor
Symp. Quant. Biol. 47:411-418 (1982), and Adams et al., J. Am. Chem. Soc. 105:661
(1983). Double stranded DNA fragments may then be obtained either by synthesizing the
complementary strand and annealing the strands together under appropriate conditions, or
by adding the complementary strand using DNA polymerase with an appropriate primer
sequence.



RG Proteins 

The present invention further provides isolated RG proteins encoded by the
RG polynucleotides disclosed herein. One of skill will recognize that the nucleic acid
encoding a functional RG protein need not have a sequence identical to the exemplified
genes disclosed here. For example, because of codon degeneracy a large number of
nucleic acid sequences can encode the same polypeptide. In addition, the polypeptides
encoded by the RG genes, like other proteins, have different domains which perform
different functions. Thus, the RG gene sequences need not be full length, so long as the
desired functional domain of the protein is expressed.
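
The scale of codon degeneracy can be illustrated by counting how many distinct coding sequences specify a given peptide under the standard genetic code. The Python sketch below is an illustration only; the peptide strings are arbitrary examples, not RG protein sequences, and the function name is hypothetical.

```python
from math import prod

# Number of codons for each amino acid in the standard genetic code
# (one-letter codes); "*" denotes a stop signal, which has three codons.
CODONS_PER_RESIDUE = {
    "L": 6, "S": 6, "R": 6,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "I": 3, "*": 3,
    "C": 2, "D": 2, "E": 2, "F": 2, "H": 2,
    "K": 2, "N": 2, "Q": 2, "Y": 2,
    "M": 1, "W": 1,
}


def count_coding_sequences(peptide: str) -> int:
    """Number of distinct nucleotide sequences encoding the given peptide."""
    try:
        return prod(CODONS_PER_RESIDUE[residue] for residue in peptide.upper())
    except KeyError as err:
        raise ValueError(f"unknown residue: {err.args[0]}") from None


if __name__ == "__main__":
    for example in ("MW", "MKL", "LSR"):  # arbitrary example peptides
        print(example, count_coding_sequences(example))
```

Since the count is a product over residues, it grows exponentially with peptide length, which is why even a short functional domain can be encoded by a very large number of different nucleic acid sequences.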

The resistance proteins are at least 25 amino acid residues in length.
Typically, the RG proteins are at least 50 amino acid residues, generally at least 100,
preferably at least 150, more preferably at least 200 amino acids in length. In particularly
preferred embodiments, the RG proteins are of sufficient length to provide resistance to
pests when expressed in the desired plants. Generally then, the RG proteins will be the
length encoded by an RG gene of the present invention. However, those of ordinary skill
will appreciate that minor deletions, substitutions, or additions to an RG protein will
typically yield a protein with pest resistance characteristics similar or identical to that of
the full length sequence. Thus, full-length RG proteins modified by 1, 2, 3, 4, or 5
deletions, substitutions, or additions generally provide an effective degree of pest
resistance relative to the full-length protein.

The RG proteins which provide pest resistance will typically comprise at
least one of an LRR or an NBS. Preferably, both are present. LRR and/or NBS regions
present in the RG proteins of the present invention can be provided by RG genes of the
present invention. In some embodiments, the LRR and/or NBS regions are obtained from
other pest resistance genes. See, e.g., Yu et al., Proc. Natl. Acad. Sci. USA, 93:11751-11756
(1996); Bent et al., Science, 265:1856-1860 (1994).

Modified protein chains can also be readily designed utilizing various 
recombinant DNA techniques well known to those skilled in the art. For example, the 
chains can vary from the naturally occurring sequence at the primary structure level by 
amino acid substitutions, additions, deletions, and the like. Modification can also include
swapping domains from the proteins of the invention with related domains from other pest
resistance genes.

Pests that can be targeted by RG genes and proteins of the present invention
include such bacterial pests as Erwinia carotovora and Pseudomonas marginalis. Fungal
pests which can be targeted by the present invention include Bremia lactucae, Marssonina
panattoniana, Rhizoctonia solani, Olpidium brassicae, root aphid, Sclerotinia
sclerotiorum and S. minor, and Botrytis cinerea which causes gray mold. RG genes also
provide resistance to viral diseases such as lettuce and turnip mosaic viruses.
Fusion Proteins 

RG polypeptides can also be expressed as recombinant proteins with one or
more additional polypeptide domains linked thereto to facilitate protein detection,
purification, or other applications. Such detection and purification facilitating domains
include, but are not limited to, metal chelating peptides such as polyhistidine tracts and
histidine-tryptophan modules that allow purification on immobilized metals, protein A
domains that allow purification on immobilized immunoglobulin, and the domain utilized
in the FLAGS extension/affinity purification system (Immunex Corp, Seattle WA). The
inclusion of cleavable linker sequences such as Factor Xa or enterokinase (Invitrogen,
San Diego CA) between the purification domain and plant disease resistant polypeptide
may be useful to facilitate purification. One such expression vector provides for
expression of a fusion protein comprising the sequence encoding a plant disease resistant
polypeptide of the invention and nucleic acid sequence encoding six histidine residues
followed by thioredoxin and an enterokinase cleavage site (e.g., see Williams (1995)
Biochemistry 34:1787-1797). The histidine residues facilitate detection and purification
while the enterokinase cleavage site provides a means for purifying the desired protein(s)
from the remainder of the fusion protein. Technology pertaining to vectors encoding
fusion proteins and application of fusion proteins are well described, see e.g., Kroll
(1993) DNA Cell Biol. 12:441-53.

Antibodies Reactive to RG Polypeptides and Immunological Assays 

The present invention also provides antibodies which specifically react with
RG proteins of the present invention under immunologically reactive conditions. An
antibody immunologically reactive with a particular antigen can be generated in vivo or by
recombinant methods such as selection of libraries of recombinant antibodies in phage or
similar vectors. "Immunologically reactive conditions" includes reference to conditions
which allow an antibody, generated to a particular epitope of an antigen, to bind to that
epitope to a detectably greater degree than the antibody binds to substantially all other
epitopes, generally at least two times above background binding, preferably at least five
times above background. Immunologically reactive conditions are dependent upon the
format of the antibody binding reaction and typically are those utilized in immunoassay
protocols.

"Antibody" includes reference to an immunoglobulin molecule obtained by
in vitro or in vivo generation of the humoral response, and includes both polyclonal and
monoclonal antibodies. The term also includes genetically engineered forms such as
chimeric antibodies (e.g., humanized murine antibodies), heteroconjugate antibodies (e.g.,
bispecific antibodies), and recombinant single chain Fv fragments (scFv). The term
"antibody" also includes antigen binding forms of antibodies (e.g., Fab', F(ab')2, Fab, Fv,
rIgG, and inverted IgG). See, Pierce Catalog and Handbook, 1994-1995 (Pierce
Chemical Co., Rockford, IL). An antibody immunologically reactive with a particular
antigen can be generated in vivo or by recombinant methods such as selection of libraries
of recombinant antibodies in phage or similar vectors. See, e.g., Huse et al. (1989)
Science 246:1275-1281; Ward et al. (1989) Nature 341:544-546; and Vaughan et al.
(1996) Nature Biotechnology, 14:309-314.

Many methods of making antibodies are known to persons of skill. A
number of immunogens are used to produce antibodies specifically reactive to an isolated
RG protein of the present invention under immunologically reactive conditions. An
isolated recombinant, synthetic, or native RG protein of the present invention is the
preferred immunogen (antigen) for the production of monoclonal or polyclonal antibodies.

The RG protein is then injected into an animal capable of producing
antibodies. Either monoclonal or polyclonal antibodies can be generated for subsequent
use in immunoassays to measure the presence and quantity of the RG protein. Methods of
producing monoclonal or polyclonal antibodies are known to those of skill in the art. See,
e.g., Coligan (1991) Current Protocols in Immunology, Wiley/Greene, NY; Harlow
and Lane (1989) Antibodies: A Laboratory Manual, Cold Spring Harbor Press, NY; and
Goding (1986) Monoclonal Antibodies: Principles and Practice (2d ed.) Academic Press,
New York, NY.

Frequently, the RG proteins and antibodies will be labeled by joining, either
covalently or non-covalently, a substance which provides for a detectable signal. A wide
variety of labels and conjugation techniques are known and are reported extensively in both
the scientific and patent literature. Suitable labels include radionucleotides, enzymes,
substrates, cofactors, inhibitors, fluorescent moieties, chemiluminescent moieties,
magnetic particles, and the like. Patents teaching the use of such labels include U.S.
Patent Nos. 3,817,837; 3,850,752; 3,939,350; 3,996,345; 4,277,437; 4,275,149; and
4,366,241.

The antibodies of the present invention can be used to screen plants for the 
expression of RG proteins of the present invention. The antibodies of this invention are 
also used for affinity chromatography in isolating RG protein. 

The present invention further provides RG polypeptides that specifically
bind, under immunologically reactive conditions, to an antibody generated against a
defined immunogen, such as an immunogen consisting of the RG polypeptides of the
present invention. Immunogens will generally be at least 10 contiguous amino acids from
an RG polypeptide of the present invention. Optionally, immunogens can be from regions
exclusive of the NBS and/or LRR regions of the RG polypeptides. Nucleic acids which
encode such cross-reactive RG polypeptides are also provided by the present invention.
The RG polypeptides can be isolated from any number of plants as discussed earlier.
Preferred are species from the family Compositae and in particular the genus Lactuca such
as L. sativa and such subspecies as crispa, longifolia, and asparagina.

"Specifically binds" includes reference to the preferential association of a
ligand, in whole or part, with a particular target molecule (i.e., "binding partner" or
"binding moiety") relative to compositions lacking that target molecule. It is, of course,
recognized that a certain degree of non-specific interaction may occur between a ligand and
a non-target molecule. Nevertheless, specific binding may be distinguished as mediated
through specific recognition of the target molecule. Typically specific binding results in a
much stronger association between the ligand and the target molecule than between the
ligand and non-target molecule. Specific binding by an antibody to a protein under such
conditions requires an antibody that is selected for its specificity for a particular protein.
The affinity constant of the antibody binding site for its cognate monovalent antigen is at
least 10^7, usually at least 10^8, preferably at least 10^9, more preferably at least 10^10, and
most preferably at least 10^11 liters/mole. A variety of immunoassay formats are
appropriate for selecting antibodies specifically reactive with a particular protein. For
example, solid-phase ELISA immunoassays are routinely used to select monoclonal
antibodies specifically reactive with a protein. See Harlow and Lane (1988) Antibodies, A
Laboratory Manual, Cold Spring Harbor Publications, New York, for a description of
immunoassay formats and conditions that can be used to determine specific reactivity. The
antibody may be polyclonal but preferably is monoclonal. Generally, antibodies
cross-reactive to such proteins as RPS2, RPM1 (bacterial resistances in Arabidopsis), L6 (fungal
resistance in flax), PRF (resistance to Pseudomonas syringae in tomato), and N (virus
resistance in tobacco), are removed by immunoabsorption.

Immunoassays in the competitive binding format are typically used for
cross-reactivity determinations. For example, an immunogenic RG polypeptide is
immobilized to a solid support. Polypeptides added to the assay compete with the binding
of the antisera to the immobilized antigen. The ability of the above polypeptides to
compete with the binding of the antisera to the immobilized RG polypeptide is compared to
the immunogenic RG polypeptide. The percent cross-reactivity for the above proteins is
calculated, using standard calculations. Those antisera with less than 10% cross-reactivity
with such proteins as RPS2, RPM1, L6, PRF, and N are selected and pooled. The
cross-reacting antibodies are then removed from the pooled antisera by immunoabsorption with
these non-RG resistance proteins.
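
By way of illustration, one widely used way to express percent cross-reactivity in a competitive binding assay is the ratio of the concentration of the immunogen giving 50% inhibition to the concentration of the competing protein giving 50% inhibition, multiplied by 100. The text above refers only to "standard calculations," so this formula, the function names, and the numerical values in the sketch below are assumptions made for the example.

```python
def percent_cross_reactivity(ic50_immunogen: float, ic50_competitor: float) -> float:
    """Assumed formula: 100 * IC50(immunogen) / IC50(competitor).
    Both values must be in the same concentration units."""
    if ic50_immunogen <= 0 or ic50_competitor <= 0:
        raise ValueError("IC50 values must be positive")
    return 100.0 * ic50_immunogen / ic50_competitor


def passes_cross_reactivity_screen(ic50_immunogen: float,
                                   competitor_ic50s: dict,
                                   max_percent: float = 10.0) -> bool:
    """True if the antiserum shows less than max_percent cross-reactivity
    with every competing (non-RG) protein tested."""
    return all(
        percent_cross_reactivity(ic50_immunogen, ic50) < max_percent
        for ic50 in competitor_ic50s.values()
    )


# Hypothetical IC50 values (arbitrary units) for a single antiserum.
competitors = {"RPS2": 40.0, "RPM1": 55.0, "L6": 80.0, "PRF": 60.0, "N": 100.0}
print(passes_cross_reactivity_screen(2.0, competitors))
```

An antiserum passing this screen would be a candidate for pooling; the 10% cutoff is the one recited above.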

The immunoabsorbed and pooled antisera are then used in a competitive
binding immunoassay to compare a second "target" polypeptide to the immunogenic
polypeptide. In order to make this comparison, the two polypeptides are each assayed at a
wide range of concentrations and the amount of each polypeptide required to inhibit 50%
of the binding of the antisera to the immobilized protein is determined using standard
techniques. If the amount of the target polypeptide required is less than twice the amount
of the immunogenic polypeptide that is required, then the target polypeptide is said to
specifically bind to an antibody generated to the immunogenic protein. As a final
determination of specificity, the pooled antisera is fully immunosorbed with the
immunogenic polypeptide until no binding to the polypeptide used in the immunosorption
is detectable. The fully immunosorbed antisera is then tested for reactivity with the test
polypeptide. If no reactivity is observed, then the test polypeptide is specifically bound by
the antisera elicited by the immunogenic protein.
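
The comparison described above can be sketched as follows. The amount of polypeptide required to inhibit 50% of binding is estimated here by linear interpolation on a log-concentration scale between the two measurements that bracket 50%; this interpolation is merely one assumed "standard technique," and the function names and example readings are hypothetical.

```python
import math


def ic50(concentrations, fraction_bound):
    """Estimate the competitor concentration at which antiserum binding to
    the immobilized protein falls to 50%, interpolating between the two
    bracketing points on a log10 concentration axis."""
    points = sorted(zip(concentrations, fraction_bound))
    for (c_lo, b_lo), (c_hi, b_hi) in zip(points, points[1:]):
        if b_lo >= 0.5 >= b_hi and b_lo != b_hi:
            t = (b_lo - 0.5) / (b_lo - b_hi)
            log_c = math.log10(c_lo) + t * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    raise ValueError("binding curve does not cross 50%")


def specifically_binds(ic50_target: float, ic50_immunogen: float) -> bool:
    """Criterion from the text: the target polypeptide requires less than
    twice the amount of the immunogenic polypeptide."""
    return ic50_target < 2.0 * ic50_immunogen


# Hypothetical dilution series: concentration versus fraction of antiserum bound.
immunogen_ic50 = ic50([1, 10, 100], [0.9, 0.5, 0.1])
target_ic50 = ic50([1, 10, 100], [0.95, 0.6, 0.2])
print(specifically_binds(target_ic50, immunogen_ic50))
```

In practice a fitted dose-response curve would usually replace the two-point interpolation; the twofold criterion is applied to the resulting values in the same way.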

Production of transgenic plants of the invention

Isolated nucleic acid constructs prepared as described herein can be 
introduced into plants according to techniques known in the art. In some embodiments, the
introduced nucleic acid is used to provide RG gene expression and therefore pest resistance 
in desired plants. In some embodiments, RG promoters are used to drive expression of 
desired heterologous genes in plants. Finally, in some embodiments, the constructs can be 
used to suppress expression of a target endogenous gene, including RG genes. 

To use isolated RG sequences in the above techniques, recombinant DNA 
vectors suitable for transformation of plant cells are prepared. Techniques for 
transforming a wide variety of higher plant species are well known and described in the 
technical and scientific literature. See, for example, Weising et al., Ann. Rev. Genet.
22:421-477 (1988). 

A DNA sequence coding for the desired RG polypeptide, for example a 
cDNA or a genomic sequence encoding a full length protein, will be used to construct a 
recombinant expression cassette which can be introduced into the desired plant. An 
expression cassette will typically comprise the RG polynucleotide operably linked to 
transcriptional and translational initiation regulatory sequences which will direct the 
transcription of the sequence from the RG gene in the intended tissues of the transformed 
plant. 

Such DNA constructs may be introduced into the genome of the desired 
plant host by a variety of conventional techniques. For example, the DNA construct may 
be introduced directly into the genomic DNA of the plant cell using techniques such as
electroporation, PEG poration, particle bombardment and microinjection of plant cell 
protoplasts or embryogenic callus, or the DNA constructs can be introduced directly to 
plant tissue using ballistic methods, such as DNA particle bombardment. Alternatively, 
the DNA constructs may be combined with suitable T-DNA flanking regions and 
introduced into a conventional Agrobacterium tumefaciens host vector. The virulence 
functions of the Agrobacterium tumefaciens host will direct the insertion of the construct 
and adjacent marker into the plant cell DNA when the cell is infected by the bacteria. 

Transformation techniques are known in the art and well described in the 
scientific and patent literature. The introduction of DNA constructs using polyethylene 
glycol precipitation is described in Paszkowski et al., EMBO J. 3:2717-2722 (1984).
Electroporation techniques are described in Fromm et al., Proc. Natl. Acad. Sci. USA
82:5824 (1985). Ballistic transformation techniques are described in Klein et al., Nature
327:70-73 (1987).

Agrobacterium tumefaciens-mediated transformation techniques are well
described in the scientific literature. See, for example, Horsch et al., Science 233:496-498
(1984), and Fraley et al., Proc. Natl. Acad. Sci. USA 80:4803 (1983). Although
Agrobacterium is useful primarily in dicots, certain monocots can be transformed by
Agrobacterium. For instance, Agrobacterium transformation of rice is described by Hiei et
al., Plant J. 6:271-282 (1994). A particularly preferred means of transforming lettuce is
described in Michelmore et al., Plant Cell Reports, 6:439-442 (1987).

Transformed plant cells which are derived by any of the above
transformation techniques can be cultured to regenerate a whole plant which possesses the
transformed genotype and thus the desired RG-controlled phenotype. Such regeneration
techniques rely on manipulation of certain phytohormones in a tissue culture growth
medium, typically relying on a biocide and/or herbicide marker which has been introduced
together with the RG nucleotide sequences. Plant regeneration from cultured protoplasts is
described in Evans et al., Protoplasts Isolation and Culture, Handbook of Plant Cell
Culture, pp. 124-176, Macmillan Publishing Company, New York, 1983; and Binding,
Regeneration of Plants, Plant Protoplasts, pp. 21-73, CRC Press, Boca Raton, 1985.
Regeneration can also be obtained from plant callus, explants, organs, or parts thereof.
Such regeneration techniques are described generally in Klee et al., Ann. Rev. of Plant
Phys. 38:467-486 (1987).

The methods of the present invention are particularly useful for
incorporating the RG polynucleotides into transformed plants in ways and under
circumstances which are not found naturally. In particular, the RG polypeptides may be
expressed at times or in quantities which are not characteristic of natural plants.

One of skill will recognize that after the expression cassette is stably
incorporated in transgenic plants and confirmed to be operable, it can be introduced into
other plants by sexual crossing. Any of a number of standard breeding techniques can be
used, depending upon the species to be crossed.

The present invention further provides methods for detecting RG resistance
genes in a nucleic acid sample suspected of comprising an RG resistance gene. The means
by which the RG resistance gene is detected is not a critical aspect of the invention. For
example, RG resistance genes can be detected by the presence of amplicons using RG
resistance gene-specific primers. Additionally, RG resistance genes can be detected by
assaying for specific hybridization of an RG polynucleotide to an RG resistance gene. In
some embodiments, the RG resistance gene can be amplified prior to the step of contacting
the nucleic acid sample with the RG polynucleotide.

In a typical detection method, the nucleic acid sample is contacted with an
RG polynucleotide to form a hybridization complex. The hybridization complex may be
detected directly (e.g., in Southern or northern blots), or indirectly (e.g., by subsequent
primer extension during PCR amplification). The RG polynucleotide hybridizes under
stringent conditions to an RG polynucleotide of the invention. Formation of the
hybridization complex is directly or indirectly used to indicate the presence of the RG
resistance gene in the nucleic acid sample.

Detection of the hybridization complex can be achieved using any number of
well known methods. For example, the nucleic acid sample, or a portion thereof, may be
assayed by hybridization formats including, but not limited to, solution phase, solid phase,
mixed phase, or in situ hybridization assays. Briefly, in solution (or liquid) phase
hybridizations, both the target nucleic acid and the probe or primer are free to interact in
the reaction mixture. In solid phase hybridization assays, probes or primers are typically
linked to a solid support where they are available for hybridization with target nucleic acids in
solution. In mixed phase, nucleic acid intermediates in solution hybridize to target nucleic
acids in solution as well as to a nucleic acid linked to a solid support. In in situ
hybridization, the target nucleic acid is liberated from its cellular surroundings in such a way as
to be available for hybridization within the cell while preserving the cellular morphology
for subsequent interpretation and analysis. The following articles provide an overview of
the various hybridization assay formats: Singer et al., Biotechniques 4(3):230-250 (1986);
Haase et al., Methods in Virology, Vol. VII, pp. 189-226 (1984); Wilkinson, "The theory
and practice of in situ hybridization" In: In situ Hybridization, Ed. D.G. Wilkinson. IRL
Press, Oxford University Press, Oxford; and Nucleic Acid Hybridization: A Practical
Approach, Ed. Hames, B.D. and Higgins, S.J., IRL Press (1987).

The effect of the modification of RG gene expression can be measured by
detection of increases or decreases in mRNA levels using, for instance, Northern blots. In
addition, the phenotypic effects of gene expression can be detected by measuring
nematode, fungal, bacterial, viral, or other pest resistance in plants. Suitable assays for
determining pest resistance are well known. See, e.g., Michelmore and Crute, Trans. Br. Mycol.
Soc., 79(3):542-546 (1982).

The means by which hybridization complexes are detected is not a critical
aspect of the present invention and can be accomplished by any number of methods
currently known or later developed. RG polynucleotides can be labeled by any one of
several methods typically used to detect the presence of hybridized nucleic acids. One
common method of detection is the use of autoradiography using probes labeled with 3H,
125I, 35S, 14C, or 32P, or the like. The choice of radioactive isotope depends on research
preferences due to ease of synthesis, stability, and half lives of the selected isotopes.
Other labels include ligands which bind to antibodies labeled with fluorophores,
chemiluminescent agents, and enzymes. Alternatively, probes can be conjugated directly
with labels such as fluorophores, chemiluminescent agents or enzymes. The choice of
label depends on sensitivity required, ease of conjugation with the probe, stability
requirements, and available instrumentation. Labeling the RG polynucleotide is readily
achieved such as by the use of labeled PCR primers.

The choice of label dictates the manner in which the label is bound to the
probe. Radioactive probes are typically made using commercially available nucleotides
containing the desired radioactive isotope. The radioactive nucleotides can be incorporated
into probes, for example, by using DNA synthesizers, by nick translation with DNA
polymerase I, by tailing radioactive DNA bases to the 3' end of probes with terminal
deoxynucleotidyl transferase, by treating single-stranded M13 plasmids having specific
inserts with the Klenow fragment of DNA polymerase in the presence of radioactive
deoxynucleotides (dNTP), by transcribing from RNA templates using reverse transcriptase
in the presence of radioactive deoxynucleotides (dNTP), or by transcribing RNA from
vectors containing specific RNA viral promoters (e.g., SP6 promoter) using the
corresponding RNA polymerase (e.g., SP6 RNA polymerase) in the presence of
radioactive ribonucleotides (rNTP).

The probes can be labeled using radioactive nucleotides in which the isotope
resides as a part of the nucleotide molecule, or in which the radioactive component is
attached to the nucleotide via a terminal hydroxyl group that has been esterified to a
radioactive component such as inorganic acids, e.g., 32P phosphate or 14C organic acids,
or esterified to provide a linking group to the label. Base analogs having nucleophilic
linking groups, such as primary amino groups, can also be linked to a label.

Non-radioactive probes are often labeled by indirect means. For example, a
ligand molecule is covalently bound to the probe. The ligand then binds to an anti-ligand
molecule which is either inherently detectable or covalently bound to a detectable signal
system, such as an enzyme, a fluorophore, or a chemiluminescent compound. Enzymes of
interest as labels will primarily be hydrolases, such as phosphatases, esterases and
glycosidases, or oxidoreductases, particularly peroxidases. Fluorescent compounds include
fluorescein and its derivatives, rhodamine and its derivatives, dansyl, umbelliferone, etc.
Chemiluminescers include luciferin, and 2,3-dihydrophthalazinediones, e.g., luminol.
Ligands and anti-ligands may be varied widely. Where a ligand has a natural anti-ligand,
namely ligands such as biotin, thyroxine, and cortisol, it can be used in conjunction with
its labeled, naturally occurring anti-ligands. Alternatively, any haptenic or antigenic
compound can be used in combination with an antibody.

Probes can also be labeled by direct conjugation with a label. For example,
cloned DNA probes have been coupled directly to horseradish peroxidase or alkaline
phosphatase (Renz, M., and Kurz, K. (1984) A Colorimetric Method for DNA
Hybridization. Nucl. Acids Res. 12:3435-3444) and synthetic oligonucleotides have been
coupled directly with alkaline phosphatase (Jablonski, E., et al. (1986) Preparation of
Oligodeoxynucleotide-Alkaline Phosphatase Conjugates and Their Use as Hybridization
Probes. Nucl. Acids Res. 14:6115-6128; and Li, P., et al. (1987) Enzyme-linked Synthetic
Oligonucleotide Probes: Non-Radioactive Detection of Enterotoxigenic Escherichia coli
in Faecal Specimens. Nucl. Acids Res. 15:5275-5287).

Definitions 

Units, prefixes, and symbols can be denoted in their SI accepted form.
Numeric ranges are inclusive of the numbers defining the range. Unless otherwise
indicated, nucleic acids are written left to right in 5' to 3' orientation. The
headings provided herein are not limitations of the various aspects or embodiments of the
invention which can be had by reference to the specification as a whole. Accordingly, the
terms defined immediately below are more fully defined by reference to the specification as
a whole.

As used herein, the term "plant" includes reference to whole plants, plant
organs (e.g., leaves, stems, roots, etc.), seeds and plant cells and progeny of same. The
class of plants which can be used in the methods of the invention is generally as broad as
the class of higher plants amenable to transformation techniques, including both
monocotyledonous and dicotyledonous plants.

As used herein, "pest" includes, but is not limited to, viruses, fungi,
nematodes, insects, and bacteria.

As used herein, "heterologous" is a nucleic acid that originates from a
foreign species, or, if from the same species, is substantially modified from its original
form. For example, a promoter operably linked to a heterologous structural gene is from a
species different from that from which the structural gene was derived, or, if from the
same species, one or both are substantially modified from their original form.

As used herein, "RG gene," alternatively referred to as "RLG gene," is a
gene encoding resistance to plant pests, such as viruses, fungi, nematodes, insects, and
bacteria, and which hybridizes under stringent conditions and/or has at least 60% sequence
identity at the deduced amino acid level to the exemplified sequences provided herein. RG
genes encode "RG polypeptides," alternatively referred to as "RLG polypeptides," which
can comprise LRR motifs and/or NBS motifs. The RG polypeptides encoded by RG genes
have at least 55% or 60% sequence identity, typically at least 65% sequence identity,
preferably at least 70% sequence identity, often at least 75% sequence identity, more
preferably at least 80% sequence identity, and most preferably at least 90% sequence
identity at the deduced amino acid level relative to the exemplary RG sequences provided
herein. The term "RG family" or "RG family genus" or "genus" includes reference to a
group of RG polypeptide sequence species that have at least 60% amino acid sequence
identity, and the nucleic acids encoding these polypeptides. The individual species of a
genus, i.e., the members of a family, typically are genetically mapped to the same locus.
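
As a minimal sketch of how the percent identity thresholds recited above might be applied, the following computes amino acid identity for two sequences that have already been aligned. The alignment method and the denominator convention are not specified in this passage; the sketch assumes a gapped pairwise alignment in which "-" marks a gap, and counts every column in which at least one sequence has a residue. The function names and example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over a pairwise alignment of equal-length strings.
    Columns where both sequences have a gap are ignored; a residue opposite
    a gap counts as a mismatch (assumed convention)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap and res_b == gap:
            continue
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned residues to compare")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 60.0) -> bool:
    """True if identity is at least the threshold (60% by default,
    the figure recited for the RG family)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical aligned fragments: 6 identical residues over 8 columns.
print(percent_identity("MKTLLV-A", "MKSLLVGA"))
```

Different alignment programs and denominator conventions give somewhat different percentages, so a borderline value should be interpreted with the method used in mind.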

As used herein, "RG polynucleotide" includes reference to a contiguous
sequence from an RG gene of at least 18, 20, 25, 30, 40, or 50 nucleotides in length, up to
at least about 100 or at least about 200 nucleotides in length. In some embodiments, the
polynucleotide is preferably at least 100 nucleotides in length, more preferably at least 200
nucleotides in length, most preferably at least 500 nucleotides in length. Thus, an RG
polynucleotide may be an RG gene or a subsequence thereof.

As used herein, "isolated," when referring to a molecule or composition,
such as, for example, an RG polypeptide or nucleic acid, means that the molecule or
composition is separated from at least one other compound, such as a protein, other nucleic
acids (e.g., RNAs), or other contaminants with which it is associated in vivo or in its
naturally occurring state. Thus, an RG polypeptide or nucleic acid is considered isolated
when it has been isolated from any other component with which it is naturally associated,
e.g., cell membrane, as in a cell extract. An isolated composition can, however, also be
substantially pure. An isolated composition can be in a homogeneous state and can be in a
dry or an aqueous solution. Purity and homogeneity can be determined, for example,
using analytical chemistry techniques such as polyacrylamide gel electrophoresis
(SDS-PAGE) or high performance liquid chromatography (HPLC).

The term "nucleic acid" or "nucleic acid molecule" or "nucleic acid
sequence" refers to a deoxyribonucleotide or ribonucleotide oligonucleotide in either
single- or double-stranded form. The term encompasses nucleic acids, i.e.,
oligonucleotides, containing known analogues of natural nucleotides which have similar or
improved binding properties, for the purposes desired, as the reference nucleic acid. The
term also includes nucleic acids which are metabolized in a manner similar to naturally
occurring nucleotides or at rates that are improved thereover for the purposes desired. The
term also encompasses nucleic-acid-like structures with synthetic backbones. DNA
backbone analogues provided by the invention include phosphodiester, phosphorothioate,
phosphorodithioate, methylphosphonate, phosphoramidate, alkyl phosphotriester,
sulfamate, 3'-thioacetal, methylene(methylimino), 3'-N-carbamate, morpholino carbamate,
and peptide nucleic acids (PNAs); see Oligonucleotides and Analogues, a Practical
Approach, edited by F. Eckstein, IRL Press at Oxford University Press (1991); Antisense
Strategies, Annals of the New York Academy of Sciences, Volume 600, Eds. Baserga and
Denhardt (NYAS 1992); Milligan (1993) J. Med. Chem. 36:1923-1937; Antisense
Research and Applications (1993, CRC Press). PNAs contain non-ionic backbones, such
as N-(2-aminoethyl) glycine units. Phosphorothioate linkages are described in WO
97/03211; WO 96/39154; Mata (1997) Toxicol Appl Pharmacol 144:189-197. Other
synthetic backbones encompassed by the term include methyl-phosphonate linkages or
alternating methylphosphonate and phosphodiester linkages (Strauss-Soukup (1997)
Biochemistry 36:8692-8698), and benzylphosphonate linkages (Samstag (1996) Antisense
Nucleic Acid Drug Dev 6:153-156). The term nucleic acid is used interchangeably with
gene, cDNA, mRNA, oligonucleotide primer, probe and amplification product. Unless
otherwise indicated, a particular nucleic acid sequence includes the complementary
sequence thereof.

The term "exogenous nucleic acid" refers to a nucleic acid that has been
isolated, synthesized, cloned, ligated, excised in conjunction with another nucleic acid, in
a manner that is not found in nature, and/or introduced into and/or expressed in a cell or
cellular environment other than or at levels or forms different than the cell or cellular
environment in which said nucleic acid or protein is found in nature. The term
encompasses both nucleic acids originally obtained from a different organism or cell type
than the cell type in which it is expressed, and also nucleic acids that are obtained from the
same cell line as the cell line in which it is expressed.

The term "recombinant," when used with reference to a cell, or to the
nucleic acid, protein or vector refers to a material, or a material corresponding to the
natural or native form of the material, that has been modified by the introduction of a new
moiety or alteration of an existing moiety, or is identical thereto but produced or derived
from synthetic materials. For example, recombinant cells express genes that are not found
within the native (non-recombinant) form of the cell or express native genes that are
otherwise expressed at a different level, typically, under-expressed or not expressed at all.
The term "recombinant means" encompasses all means of expressing, i.e., transcription
or translation of, an isolated and/or cloned nucleic acid in vitro or in vivo. For example,
the term "recombinant means" encompasses techniques where a recombinant nucleic acid,
such as a cDNA encoding a protein, is inserted into an expression vector, the vector is
introduced into a cell and the cell expresses the protein. "Recombinant means" also
encompasses the ligation of nucleic acids having coding or promoter sequences from
different sources into one vector for expression of a fusion protein, constitutive expression
of a protein, or inducible expression of a protein, such as the plant disease resistant, or
RG, polypeptides of the invention.

The term "specifically hybridizes" refers to a nucleic acid that hybridizes,
duplexes or binds to a particular target DNA or RNA sequence. The target sequences can
be present in a preparation of total cellular DNA or RNA. Proper annealing conditions
depend, for example, upon a nucleic acid's, such as a probe's length, base composition,
and the number of mismatches and their position on the probe, and can be readily
determined empirically providing the appropriate reagents are available. For discussions
of nucleic acid probe design and annealing conditions, see, e.g., Sambrook and Ausubel.

The terms "stringent hybridization," "stringent conditions," or "specific
hybridization conditions" refer to conditions under which an oligonucleotide (when used,
for example, as a probe or primer) will hybridize to its target subsequence, such as an RG
nucleic acid in an expression vector of the invention, but not to a non-RG sequence.
Stringent conditions are sequence-dependent. Thus, in one set of stringent conditions an
oligonucleotide probe will hybridize to only one species of the genus of RG nucleic acids of
the invention. In another set of stringent conditions (less stringent) an oligonucleotide
probe will hybridize to all species of the invention's genus but not to non-RG nucleic
acids. Longer sequences hybridize specifically at higher temperatures. Stringent
conditions are selected to be about 5°C lower than the thermal melting point (Tm) for the
specific sequence at a defined ionic strength and pH. The Tm is the temperature (under
defined ionic strength, pH, and nucleic acid concentration) at which 50% of the probes
complementary to the target sequence hybridize to the target sequence at equilibrium (if
the target sequences are present in excess, at Tm, 50% of the probes are occupied at
equilibrium). Typically, stringent conditions will be those in which the salt concentration
is less than about 1.0 M sodium ion, i.e., about 0.01 to 1.0 M sodium ion concentration
(or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30°C for short probes
(e.g., 10 to 50 nucleotides) and at least about [temperature illegible] for long probes (e.g.,
greater than 50 nucleotides). Stringent conditions may also be achieved with the addition
of destabilizing agents such as formamide. Often, a high stringency wash is preceded by a
low stringency wash to remove background probe signal. An example of medium
stringency wash conditions for a duplex of, e.g., more than 100 nucleotides, is 1x SSC at
45°C for 15 minutes (see Sambrook for a description of SSC buffer). An example of a low
stringency wash for a duplex of, e.g., more than 100 nucleotides, is 4-6x SSC at 40°C for
15 minutes. A signal to noise ratio of 2x (or higher) than that observed for an unrelated
probe in the particular hybridization assay indicates detection of a "specific hybridization."
Nucleic acids which do not hybridize to each other under stringent conditions can still be
substantially identical if the polypeptides which they encode are substantially identical.
This can occur, e.g., when a nucleic acid is created that encodes conservative
substitutions. Stringent hybridization and stringent hybridization wash conditions are
different under different environmental parameters, such as for Southern and Northern
hybridizations. An extensive guide to the hybridization of nucleic acids is found in, e.g.,
Sambrook, Tijssen (1993) supra.

As used herein "operably linked" includes reference to a functional linkage
between a promoter and a second sequence, wherein the promoter sequence initiates and
mediates transcription of the DNA sequence corresponding to the second sequence.
Generally, operably linked means that the nucleic acid sequences being linked are
contiguous and, where necessary to join two protein coding regions, contiguous and in the
same reading frame.

In the expression of transgenes one of skill will recognize that the inserted
polynucleotide sequence need not be identical and may be "substantially identical" to a
sequence of the gene from which it was derived. As explained herein, these variants are
specifically covered by this term.

In the case where the inserted polynucleotide sequence is transcribed and
translated to produce a functional RG polypeptide, one of skill will recognize that because
of codon degeneracy, a number of polynucleotide sequences will encode the same
polypeptide. These variants are specifically covered by the term "RG polynucleotide
sequence". In addition, the term specifically includes those full length sequences
substantially identical (determined as described herein) with an RG gene sequence which
encode proteins that retain the function of the RG protein. Thus, in the case of RG genes
disclosed here, the term includes variant polynucleotide sequences which have substantial
identity with the sequences disclosed here and which encode proteins capable of conferring
resistance to nematodes, bacteria, viruses, fungi, insects or other pests on a transgenic
plant comprising the sequence.
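
The effect of codon degeneracy noted above can be illustrated with a short sketch. It is not part of the disclosed methods; it assumes the standard genetic code, and the function names are hypothetical. It counts how many distinct polynucleotide sequences encode a given peptide, using the GVGKTT motif as the example.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ('*' = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codon_counts():
    """Return a dict mapping each amino acid to its number of codons."""
    counts = {}
    for aa in CODON_TABLE.values():
        counts[aa] = counts.get(aa, 0) + 1
    return counts


def count_encoding_sequences(peptide):
    """Count the nucleotide sequences that encode `peptide` (hypothetical helper)."""
    counts = synonymous_codon_counts()
    total = 1
    for aa in peptide.upper():
        total *= counts[aa]  # each residue multiplies the possibilities
    return total


if __name__ == "__main__":
    # Gly, Val and Thr have 4 codons each; Lys has 2.
    print(count_encoding_sequences("GVGKTT"))  # 4*4*4*2*4*4 = 2048
```

Even a six-residue motif can therefore be encoded by thousands of sequences, which is why the term "RG polynucleotide sequence" is defined to cover such variants.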

Two polynucleotides or polypeptides are said to be "identical" if the
sequence of nucleotides or amino acid residues, respectively, in the two sequences is the
same when aligned for maximum correspondence, as described below. The term
"complementary to" is used herein to mean that the complementary sequence is identical to
all or a specified contiguous portion of a reference polynucleotide sequence.

The terms "sequence identity," "sequence similarity" and "homology" refer
to the relationship between two sequences, such as the nucleic acid and amino acid
sequences or the polypeptides of the invention, when optimally aligned, as with, for
example, the programs PILEUP, BLAST, GAP, FASTA or BESTFIT (see discussion,
supra). "Percentage amino acid/nucleic acid sequence identity" refers to a comparison of
the sequences of two polypeptides/nucleic acids which, when optimally aligned, have
approximately the designated percentage of the same amino acids/nucleic acids,
respectively. For example, "60% sequence identity" and "60% homology" refer to a
comparison of the sequences of two RG nucleic acids or polypeptides which, when
optimally aligned, have 60% identity. For example, in one embodiment, nucleic acids
encoding RG polypeptides of the invention comprise a sequence with at least 50% nucleic
acid sequence identity to SEQ ID NO:1. In other embodiments, the RG polypeptides of
the invention are encoded by nucleic acids comprising a sequence with at least 50%
sequence identity to SEQ ID NO:1, or are encoded by nucleic acids comprising SEQ ID
NO:1, or have at least 60% amino acid sequence identity to the polypeptide of SEQ ID
NO:2.

"Percentage of sequence identity" is determined by comparing two optimally 
aligned sequences over a comparison window, wherein the portion of the polynucleotide 
sequence in the comparison window may comprise additions or deletions (i.e., gaps) as 
compared to the reference sequence (which does not comprise additions or deletions) for 
optimal alignment of the two sequences. The percentage is calculated by determining the 
number of positions at which the identical nucleic acid base or amino acid residue occurs 
in both sequences to yield the number of matched positions, dividing the number of 
matched positions by the total number of positions in the window of comparison and 
multiplying the result by 100 to yield the percentage of sequence identity. 
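
The calculation just described can be sketched as follows. This is a minimal illustration, not the BESTFIT or GAP implementation: it assumes the two sequences have already been optimally aligned by some other program, that gaps are written as "-", and that the comparison window is the whole alignment. The function name is hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of sequence identity over an existing alignment.

    Matched positions are those where both sequences carry the same base or
    residue; a gap never counts as a match. The divisor is the total number
    of positions in the comparison window, as in the definition above.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    window = len(aligned_a)
    if window == 0:
        raise ValueError("comparison window is empty")
    matched = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * matched / window


if __name__ == "__main__":
    # Five of six aligned positions are identical.
    print(round(percent_identity("GVGKTT", "GVGKST"), 1))
    # A gap in one sequence counts as a position but not as a match.
    print(round(percent_identity("ACGT-A", "ACGTTA"), 1))
```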

The term "substantial identity" of polynucleotide sequences means that a
polynucleotide comprises a sequence that has at least 55% or 60% sequence identity,
generally at least 65%, preferably at least 70%, often at least 75%, more preferably at
least 80% and most preferably at least 90%, compared to a reference sequence using the
programs described above (preferably BESTFIT) using standard parameters. One of skill
will recognize that these values can be appropriately adjusted to determine corresponding
identity of proteins encoded by two nucleotide sequences by taking into account codon
degeneracy, amino acid similarity, reading frame positioning and the like. Substantial
identity of amino acid sequences for these purposes normally means sequence identity of at
least 55% or 60%, preferably at least 70%, more preferably at least 80%, and most
preferably at least 95%. Polypeptides having "sequence similarity" share sequences as
noted above except that residue positions which are not identical may differ by
conservative amino acid changes. Conservative amino acid substitutions refer to the
interchangeability of residues having similar side chains. For example, a group of amino
acids having aliphatic side chains is glycine, alanine, valine, leucine, and isoleucine; a
group of amino acids having aliphatic-hydroxyl side chains is serine and threonine; a group
of amino acids having amide-containing side chains is asparagine and glutamine; a group
of amino acids having aromatic side chains is phenylalanine, tyrosine, and tryptophan; a
group of amino acids having basic side chains is lysine, arginine, and histidine; and a
group of amino acids having sulfur-containing side chains is cysteine and methionine.
Preferred conservative amino acid substitution groups are: valine-leucine-isoleucine,
phenylalanine-tyrosine, lysine-arginine, alanine-valine, and asparagine-glutamine.
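
The side-chain groups listed above lend themselves to a simple lookup. The sketch below is illustrative only; the group names and the function name are hypothetical, and it encodes just the six groups named in the text using one-letter amino acid codes.

```python
# Side-chain groups as listed in the text (one-letter amino acid codes).
CONSERVATIVE_GROUPS = {
    "aliphatic": set("GAVLI"),
    "aliphatic-hydroxyl": set("ST"),
    "amide-containing": set("NQ"),
    "aromatic": set("FYW"),
    "basic": set("KRH"),
    "sulfur-containing": set("CM"),
}


def is_conservative_substitution(residue_a, residue_b):
    """Return True if two different residues share one of the listed groups."""
    a, b = residue_a.upper(), residue_b.upper()
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS.values())


if __name__ == "__main__":
    for pair in [("L", "I"), ("S", "T"), ("K", "D")]:
        print(pair, is_conservative_substitution(*pair))
```

A similarity score could be built on top of this by counting conservative substitutions together with identical positions, although the exact scoring used by any particular alignment program is not specified here.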

Another indication that nucleotide sequences are substantially identical is if
two molecules hybridize to each other under appropriate conditions. Appropriate
conditions can be high or low stringency and will be different in different circumstances.
Generally, stringent conditions are selected to be about 5°C to about 20°C lower than the
thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH.
The Tm is the temperature (under defined ionic strength and pH) at which 50% of the
target sequence hybridizes to a perfectly matched probe. Typically, stringent wash
conditions are those in which the salt concentration is about 0.02 molar at pH 7 and the
temperature is at least about 50°C. However, nucleic acids which do not hybridize to each
other under stringent conditions are still substantially identical if the polypeptides which
they encode are substantially identical. This may occur, e.g., when a copy of a nucleic
acid is created using the maximum codon degeneracy permitted by the genetic code.

Nucleic acids of the invention can be identified from a cDNA or genomic
library prepared according to standard procedures and the nucleic acids disclosed here used
as a probe. Thus, for example, stringent hybridization conditions will typically include at
least one low stringency wash using 0.3 molar salt (e.g., 2X SSC) at 65°C. The washes
are preferably followed by one or more subsequent washes using 0.03 molar salt (e.g.,
0.2X SSC) at 50°C, usually 60°C, or more usually 65°C. Nucleic acid probes used to
identify the nucleic acids are preferably at least 100 nucleotides in length.
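
The relation between SSC dilution and salt molarity used in this paragraph (2X SSC corresponding to about 0.3 molar salt, 0.2X SSC to about 0.03 molar) can be made explicit. The sketch below assumes a simple linear scaling of about 0.15 molar salt per 1X SSC, which is what those two figures imply; the constant and function names are hypothetical.

```python
# Salt molarity per 1X SSC implied by the text (2X = 0.3 M, 0.2X = 0.03 M).
MOLAR_SALT_PER_1X_SSC = 0.15


def ssc_to_molar(ssc_fold):
    """Approximate molar salt concentration of an SSC wash of given strength."""
    if ssc_fold < 0:
        raise ValueError("SSC strength cannot be negative")
    return MOLAR_SALT_PER_1X_SSC * ssc_fold


if __name__ == "__main__":
    # The two wash strengths named in the text.
    for label, fold in [("low stringency wash", 2.0), ("subsequent wash", 0.2)]:
        print(f"{label}: {fold}X SSC is about {ssc_to_molar(fold):.2f} M salt")
```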

As used herein, "nucleotide binding site" or "nucleotide binding domain"
("NBS") includes reference to highly conserved nucleotide-, i.e., ATP/GTP-, binding
domains, typically included in the "kinase domain" of kinase polypeptides, such as a
kinase-1a, kinase 2, or a kinase 3a motif, as described herein. For example, the tobacco N
and Arabidopsis RPS2 genes, among several recently cloned disease-resistance genes,
share highly conserved NBS sequences. Kinase NBS subdomains further consist of three
subdomain motifs: the P-loop, kinase-2, and kinase-3a subdomains (Yu (1996) Proc. Natl.
Acad. Sci. USA 93:11751-11756). As discussed in detail herein, examples include the
Arabidopsis RPP5 gene (Parker (1997) supra), the A. thaliana RPS2 gene (Mindrinos
(1997) supra), and the flax L6 rust resistance gene (Lawrence (1995) supra) which all
encode proteins containing an NBS; and Mindrinos (1994) Cell 78:1089-1099; and Shen
(1993) FEBS 335:380-385. Using the teachings disclosed and incorporated herein and
standard nucleic acid hybridization and/or amplification techniques, one of skill can
identify members having NBS domains, including any of the genus of NBS-containing
plant disease resistant polypeptides of the invention.

As used herein, "leucine rich region" ("LRR") includes reference to a
region that has a leucine content of at least 20% leucine or isoleucine, or 30% of the
aliphatic residues: leucine, isoleucine, methionine, valine, and phenylalanine, and arranged
with approximate repeated periodicity. The length of the repeat may vary but is
generally about 20 to 30 amino acids. An LRR-containing polypeptide typically will have
the canonical 24 amino acid leucine-rich repeat (LRR) sequence, which is present in
different proteins that mediate molecular recognition and/or interaction processes; as
described in Bent (1994) Science 265:1856-1860; Parker (1997) Plant Cell 9:879-894;
Hong (1997) Plant Physiol. 113:1203-1212; Schmitz (1997) Nucleic Acids Res.
25:756-763; Hipskind (1996) Mol. Plant Microbe Interact. 9:819-825; Tornero (1996)
Plant J. 10:315-330; Dixon (1996) Cell 84:451-459; Jones (1994) Science 266:789-793;
Lawrence (1995) Plant Cell 7:1195-1206; Song (1995) Science 270:1804-1806; as
discussed in further detail supra. Using the teachings disclosed and incorporated herein
and standard nucleic acid hybridization and/or amplification techniques, one of skill can
identify polypeptides having LRR domains, including any member of the genus of LRR-
containing RG polypeptides of the invention.
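
The compositional part of the LRR definition above (at least 20% leucine or isoleucine, or at least 30% of the listed aliphatic residues) can be checked mechanically. The sketch below is only a rough screen under stated assumptions: it tests composition over a region supplied by the user and does not test the repeated periodicity that the definition also requires. The function name is hypothetical.

```python
LEU_ILE = set("LI")
ALIPHATIC = set("LIMVF")  # leucine, isoleucine, methionine, valine, phenylalanine


def meets_lrr_composition(region):
    """Check the 20% Leu/Ile or 30% aliphatic thresholds for a peptide region.

    Integer arithmetic is used so that regions sitting exactly on a
    threshold are classified without floating-point rounding surprises.
    """
    region = region.upper()
    length = len(region)
    if length == 0:
        raise ValueError("region is empty")
    leu_ile = sum(1 for aa in region if aa in LEU_ILE)
    aliphatic = sum(1 for aa in region if aa in ALIPHATIC)
    return leu_ile * 100 >= 20 * length or aliphatic * 100 >= 30 * length


if __name__ == "__main__":
    # A made-up 24-residue stretch used purely for illustration.
    print(meets_lrr_composition("LKNLEVLDLSGCSNLESLPELPSS"))
```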

The term "promoter" refers to a region or sequence determinants located
upstream or downstream from the start of transcription and which are involved in
recognition and binding of RNA polymerase and other proteins to initiate transcription. A
"plant promoter" is a promoter capable of initiating and/or regulating transcription in plant
cells; see also discussion on plant promoters, supra.

The term "constitutive promoter" refers to a promoter that initiates and
helps control transcription in all tissues. Promoters that drive expression continuously
under physiological conditions are referred to herein as "constitutive" promoters and are
active under most environmental conditions and states of development or cell
differentiation; see also detailed discussion, supra.

The term "inducible promoter" refers to a promoter which directs
transcription under the influence of changing environmental conditions or developmental
conditions. Examples of environmental conditions that may affect transcription by
inducible promoters include anaerobic conditions, elevated temperature, drought, or the
presence of light. Such promoters are referred to herein as "inducible" promoters; see also
detailed discussion, supra.

The term "abscission-induced promoter" or "abscission promoter" refers to a
class of promoters which are activated upon plant ripening, such as fruit ripening, and are
especially useful incorporated in the expression systems (e.g., expression cassettes,
vectors) of the invention. When the plant disease resistant polypeptide-encoding nucleic
acid is under the control of an abscission promoter, rapid cell death, induced by expression
of the invention's polypeptide, accelerates and/or accentuates abscission of the plant part,
increasing the efficiency of the harvesting of fruits or other plant parts, such as cotton, and
the like; see also detailed discussion, supra.

The term "tissue-specific promoter" refers to a class of transcriptional
control elements that are only active in particular cells or tissues. Examples of plant
promoters under developmental control include promoters that initiate transcription only
(or primarily only) in certain tissues, such as roots, leaves, fruit, ovules, seeds, pollen,
pistils, or flowers; see also detailed discussion, supra.



WO 98/30083 PCT/US98/00615

As used herein "recombinant" includes reference to a cell, or nucleic acid,
or vector, that has been modified by the introduction of a heterologous nucleic acid or the
alteration of a native nucleic acid to a form not native to that cell, or that the cell is derived
from a cell so modified. Thus, for example, recombinant cells express genes that are not
found within the native (non-recombinant) form of the cell or express native genes that are
otherwise abnormally expressed, under expressed or not expressed at all.

As used herein, a "recombinant expression cassette" or "expression
cassette" is a nucleic acid construct, generated recombinantly or synthetically, with a series
of specified nucleic acid elements which permit transcription of a particular nucleic acid in
a target cell. The expression vector can be part of a plasmid, virus, or nucleic acid
fragment. Typically, the recombinant expression cassette portion of the expression vector
includes a nucleic acid to be transcribed, and a promoter.

As used herein, "transgenic plant" includes reference to a plant modified by 
introduction of a heterologous polynucleotide. Generally, the heterologous polynucleotide 
is an RG structural or regulatory gene or subsequences thereof. 

As used herein, "hybridization complex" includes reference to a duplex 
nucleic acid sequence formed by selective hybridization of two single-stranded nucleic 
acids with each other. 

As used herein, "amplified" includes reference to an increase in the molarity
of a specified sequence. Amplification methods include the polymerase chain reaction
(PCR), the ligase chain reaction (LCR), the transcription-based amplification system
(TAS), and the self-sustained sequence replication system (SSR). A wide variety of cloning
methods, host cells, and in vitro amplification methodologies are well-known to persons of
skill.

As used herein, "nucleic acid sample" includes reference to a specimen
suspected of comprising RG resistance genes. Such specimens are generally derived,
directly or indirectly, from lettuce tissue.

The term "antibody" refers to a polypeptide substantially encoded by an 
immunoglobulin gene or immunoglobulin genes, or fragments or synthetic or recombinant 
analogues thereof which specifically bind and recognize analytes and antigens, such as a 
genus or subgenus of polypeptides of the invention, as described supra. 
It is understood that the examples and embodiments described herein are for
illustrative purposes only and that various modifications or changes in light thereof will be
suggested to persons skilled in the art and are to be included within the spirit and purview
of this application and scope of the appended claims.


EXAMPLES 

The following examples are offered to illustrate, but not to limit the claimed
invention.

Example 1 describes the use of PCR to amplify RG genes from lettuce.

Multiple primers with low degeneracy, particularly at the 3' end, were
designed based on the sequences of two known resistance genes from tobacco and flax.

DNA Templates

Lettuce genomic DNA was extracted from cultivar Diana and a mutant line
derived from cultivar Diana using a standard CTAB protocol. To generate cDNA
templates, RNA was isolated from cultivar Diana and the mutant following standard
procedures; first strand cDNA was synthesized using Superscript reverse transcriptase
from 1 µg total RNA as specified by the manufacturer (Life Technologies). BAC (bacterial
artificial chromosome) clones from the Dm3 region were isolated from a BAC library of
over 53,000 clones using marker AC15 that was known to be closely linked to Dm3.
Bacterial plasmids containing clones of L6 and RPS2 were used as positive controls.

PCR with degenerate oligonucleotide primers

Oligonucleotide primers were designed based on conserved motifs in the
nucleotide binding sites (NBS) of L6, RPS2, and N. Eight primers were made
corresponding to the GVGKTT motif in the sense direction; each had 64-fold degeneracy.
Six primers were made to the GLPLAL motif in the anti-sense direction, with either 16- or
256-fold degeneracy (Table 1).
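
The degeneracy figures quoted above follow directly from the number of ambiguous positions in each primer. As a sketch (the function name is hypothetical, and the mapping uses the standard IUPAC nucleotide ambiguity codes), the degeneracy of a primer is the product of the number of bases allowed at each position:

```python
# Number of bases represented by each IUPAC nucleotide code.
IUPAC_BASE_COUNTS = {
    "A": 1, "C": 1, "G": 1, "T": 1, "U": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}


def primer_degeneracy(primer):
    """Return the number of distinct sequences in a degenerate primer mix."""
    degeneracy = 1
    for base in primer.upper().replace(" ", ""):
        degeneracy *= IUPAC_BASE_COUNTS[base]
    return degeneracy


if __name__ == "__main__":
    # Primer sequences as listed in Table 1.
    print(primer_degeneracy("GGN GTN GGN AAA ACG AC"))  # three N positions: 64
    print(primer_degeneracy("AGN GCN AGN GGN AGG CC"))  # four N positions: 256
    print(primer_degeneracy("AAN GCC AAN GGC AAA CC"))  # two N positions: 16
```

Splitting the final wobble position of a motif across several primers, rather than using one more N, keeps the degeneracy of each individual primer low at its 3' end.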

Oligonucleotides included 14-mer adaptors of (CUA)4 at the 5' end of the
sense primers and (CAU)4 at the 5' end of the antisense primers to allow rapid cloning of
the PCR products into pAMP1 (Life Technologies).

PCR amplification was performed in a 50 µl reaction volume with 1 µM of
each of a pair of sense and antisense primers. The templates were denatured by heating to
94°C for 2 min. This was followed by 35 cycles of 30 sec at 94°C, 1 min at 50°C, 2 min
at 72°C, with a single final extension of 5 min at 72°C. 25 ng of genomic DNA or cDNA
was used. BAC clones as templates required less. The final dNTP concentration was 0.2
mM; MgCl2 was 1.5 mM.
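
For planning purposes, the cycling program and primer amounts described above can be written down as data and summed. The sketch below is an illustration only: it counts programmed hold times and ignores ramping between temperatures, which depends on the thermal cycler, and all names are hypothetical.

```python
# Cycling program as described in the text: (temperature in deg C, seconds).
INITIAL_DENATURATION = (94, 2 * 60)
CYCLE_STEPS = [(94, 30), (50, 60), (72, 2 * 60)]
CYCLE_COUNT = 35
FINAL_EXTENSION = (72, 5 * 60)


def total_hold_time_seconds():
    """Sum of programmed hold times, excluding temperature ramping."""
    per_cycle = sum(seconds for _temp, seconds in CYCLE_STEPS)
    return INITIAL_DENATURATION[1] + CYCLE_COUNT * per_cycle + FINAL_EXTENSION[1]


def primer_picomoles(concentration_um, volume_ul):
    """Amount of primer in pmol; 1 uM equals 1 pmol per microliter."""
    return concentration_um * volume_ul


if __name__ == "__main__":
    seconds = total_hold_time_seconds()
    print(f"{seconds} s, about {seconds / 60:.1f} min of hold time")
    print(f"{primer_picomoles(1, 50)} pmol of each primer per 50 ul reaction")
```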

Forty-eight combinations of sense and antisense primers were tested on a
panel of nine templates consisting of two genomic DNA samples, two cDNA preparations,
three BAC clones and plasmids containing L6 and RPS2 as positive controls.
Amplification from L6 and RPS2 resulted in fragments of 516 and 513 bp, respectively. Seven
combinations of primers resulted in fragments of approximately this size with multiple
templates (Table 2). Primers that gave RLG products were: PLOOPAA, PLOOPAG,
PLOOPGA, PLOOPGG, PLOOPAG, GLPL3, GLPL4.



TABLE 1. DEGENERATE PRIMER SEQUENCES for NBS PCR

Sense primers based on the GVGKTT amino acid sequence from the L6, N and RPS2 P-loop
motif:

PLOOPAG   5' GGN GTN GGN AAA ACG AC 3'
PLOOPAA   5' GGN GTN GGN AAA ACA AC 3'
PLOOPAT   5' GGN GTN GGN AAA ACT AC 3'
PLOOPAC   5' GGN GTN GGN AAA ACC AC 3'
PLOOPGG   5' GGN GTN GGN AAG ACG AC 3'
PLOOPGA   5' GGN GTN GGN AAG ACA AC 3'
PLOOPGT   5' GGN GTN GGN AAG ACT AC 3'
PLOOPGC   5' GGN GTN GGN AAG ACC AC 3'

Antisense primers based on the GLPLAL amino acid sequence:

GLPL1     5' AGN GCN AGN GGN AGG CC 3'
GLPL2     5' AGN GCN AGN GGN AGA CC 3'
GLPL3     5' AGN GCN AGN GGN AGT CC 3'
GLPL4     5' AGN GCN AGN GGN AGC CC 3'
GLPL5     5' AAN GCC AAN GGC AAA CC 3'
GLPL6     5' AAN GCC AAN GGC AAT CC 3'
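
One way to confirm that an antisense primer really corresponds to the GLPLAL motif is to reverse-complement it and translate the result. The sketch below does this under stated assumptions: it uses the standard genetic code, handles only the ambiguity code N, and reports "X" for any codon whose N expansions do not all encode the same amino acid. The function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ('*' = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G", "N": "N"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def translate_degenerate(seq):
    """Translate complete codons from position 0, expanding each N to 4 bases."""
    peptide = []
    for start in range(0, len(seq) - 2, 3):
        codon = seq[start:start + 3].upper()
        choices = [BASES if base == "N" else base for base in codon]
        residues = {CODON_TABLE["".join(c)] for c in product(*choices)}
        peptide.append(residues.pop() if len(residues) == 1 else "X")
    return "".join(peptide)


if __name__ == "__main__":
    # Sense primer PLOOPAG translates directly; the 17-mer covers five codons.
    print(translate_degenerate("GGNGTNGGNAAAACGAC"))
    # Antisense primers GLPL1 and GLPL5 are reverse-complemented first.
    print(translate_degenerate(reverse_complement("AGNGCNAGNGGNAGGCC")))
    print(translate_degenerate(reverse_complement("AANGCCAANGGCAAACC")))
```

Because each primer is a 17-mer, only the first five codons of each six-residue motif are complete; the final two bases fix the first two positions of the sixth codon.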
TABLE 2. Characteristics of RLGs isolated from lettuce.

       Template      Primers                    Number(a)    Size(b)  Copy       Dm
                                                             (bp)     number(c)  linkage
RLG1   cDNA          PLOOPGA+GLPL6              [illegible]  522                 Dm4, Dm13
       genomic DNA   PLOOPGA+GLPL[illegible]    1/5
       cDNA          PLOOPAA+GLPL6              5/5
                     PLOOPAA+GLPL6              1/1
RLG2   BAC H8        PLOOPGG+GLPL3              3/3          510                 Dm1, Dm3
RLG3   genomic DNA   PLOOPGA+GLPL4              3/6          461                 Dm5, Dm8
RLG4   genomic DNA   PLOOPGA+GLPL4              1/6          524

(a) Number of RLG sequences out of total number of clones sequenced.
(b) Size of fragment amplified from the nucleotide binding domain.
(c) Estimated copy number from genomic Southern blot analysis and numbers of
clones in the BAC library.



Example 2

Example 2 describes the genetic analysis used to obtain a preliminary
indication of the linkage relationships of the amplified products and known clusters of
resistance genes.

Bulked segregant analysis was performed to obtain a preliminary indication
of the linkage relationships of the amplified products and known clusters of resistance
genes. DNA from individuals was pooled for each susceptible and resistant bulk.
Amplified products were then mapped by RFLP analysis on our intraspecific mapping
population. Resistances from four clusters of resistance genes as well as over six hundred
markers have now been mapped on this population. Linkage analysis was done using the
JOINMAP or MAPMAKER mapping programs. Due to a suppression of recombination in
the Dm3 region, sequences were mapped relative to Dm3 using a panel of deletion mutants
that provided greater genetic resolution than the mapping population (Anderson et al.,
1996). All blots were washed twice at 63°C in 2x SSC/1% SDS for 20 min, followed by
one wash at 63°C in 1x SSC/0.1% SDS for 10 or 30 min.
Most of the RLG sequences were analyzed by bulked segregant analysis
(BSA) using pools of resistant and susceptible individuals for each of the four clusters of
resistance genes. In genomic Southern analyses, all the RLGs revealed numerous
fragments of varying intensity. The number of bands was highly dependent on the
stringency of hybridization. BSA demonstrated that RLG1 was linked to the Dm4, 7 and
Dm13 clusters. Segregation analysis confirmed this linkage.

RLG2 was derived from BAC H8 that was known to be from the Dm3
region. BSA with RLG2 demonstrated that the polymorphic bands that distinguished the
parents of our mapping population mapped to the Dm1, Dm3 cluster. Several bands
absolutely cosegregated with Dm1 or Dm3. To provide finer genetic resolution, RLG2
was also mapped using a panel of Dm3 deletion mutants. A number of fragments were
missing in the largest deletion mutant, demonstrating that several RLG2 family members are
physically located very close to Dm3. No fragment was missing in all deletion mutants;
however, this is not unexpected as there is extensive duplication within the region.

Example 3

Example 3 describes the screening of a bacterial artificial chromosome
library.

Over 53,000 BAC clones containing lettuce genomic DNA were screened
with two of the amplified products. High density filters each containing 1536 clones were
hybridized to 32P-labelled probes. Filters were washed at 65°C with 40 mM NaPO4/0.1%
SDS for 5 min followed by 20 min in the same solution.

To isolate additional RLG sequences we screened our genomic BAC library.
Clones were identified that hybridized to RLG1 and RLG2. Nearly all the clones that
hybridized to RLG2 also hybridized to marker AC15 that had already been shown by
deletion mutant analysis to be clustered around Dm3. This provided further evidence for
clustering of RLG2 sequences.

Using primers conserved within each family, part of the NBS was amplified
from each unique BAC clone and sequenced. This revealed that members within each
family varied from 64% identical at the deduced amino acid level. The most divergent
members only weakly cross-hybridized to each other. Currently, RLG sequences are
considered to be part of the same family of sequences if they are at least 55% identical at
the deduced amino acid level and map to the same region of the chromosome.



Example 4:

Example 4 describes the cloning, identification, sequencing and
characterization of RG polynucleotide sequences, including use of RG sequences from
plasmid and PCR products.

Double-stranded plasmid DNA clones and PCR products were sequenced
using an ABI377 automated sequencer and fluorescently labelled di-deoxy terminators.
Sequences were assembled using Sequencher (Genecodes), DNAStar (DNAStar) and
Genetics Computer Group (GCG, Madison, WI) software. Database searches were
performed using BLASTX and FASTA (GCG) algorithms.

Sequences flanking the NBS region for RLG2 and for some of RLG1 were
obtained by a series of IPCR and the products sequenced directly. IPCR worked less well
for RLG1. Therefore RLG1 was subcloned from a BAC clone into pBSK (Stratagene) and
the double stranded plasmid sequenced by long range sequencing.

Initially, a total of 30 clones were sequenced. Three of these seven primer
combinations yielded sequences that comprised continuous open reading frames with
sequence identity to the NBS of known resistance genes. Seven out of 10 clones amplified
from genomic DNA with the primer pair PLOOPGA/GLPL6 were 522 bp long; they were
identical to each other and named RLG1. All six clones amplified from genomic DNA or
cDNA using the primers PLOOPAA/GLPL6 were similar to or the same as RLG1. All three
clones sequenced from BAC clone H8 were 510 bp long, identical to each other but
different from RLG1, and were therefore designated RLG2. The 11 clones sequenced from
four other primer combinations had no similarity to any NBS motifs and therefore were not
studied further. Therefore, sequencing resulted in the identification of clones containing
NBS motifs representing four RLG sequences.

Comparison of the deduced amino acid sequences of RLG1 and RLG2 to
those of known resistance genes revealed that RLG1 and RLG2 are as similar to each other
as they are to resistance genes from other species and that this is the same level of identity
shown between the known resistance genes (Table 3). The percent identity (upper
quadrant) and percent similarity (lower quadrant) were determined using the MEGALIGN
routine of the DNASTAR package. Identity refers to the proportion of identical amino
acids; similarity refers to the proportion of identical and similar amino acids and takes into
account substitutions of amino acids with similar chemical characteristics. RG1 and RG2
are as similar to each other and to cloned resistance genes as cloned resistance genes from
a variety of species are to each other. L6, resistance to Melampsora lini in flax (Lawrence
et al., 1995). N, resistance to tobacco mosaic virus in tobacco (Whitham et al., 1994).
PRF, required for resistance to Pseudomonas syringae in tomato. RPS2, resistance to
Pseudomonas syringae in Arabidopsis thaliana (Bent et al., 1994; Mindrinos et al., 1994).
RPM1, resistance to Pseudomonas syringae pv. maculicola in A. thaliana (Grant et al.,
1995). The initial RG1 and RG2 sequences were amplified from lettuce using degenerate
primers.



Table 3
IDENTITIES OF RESISTANCE GENE HOMOLOGUES

                        RG1    RG2    RG3    RG4    N gene  RPS2
Lettuce      RG1        ***    22.7   15.0   29.2   25.4    23.8
Lettuce      RG2               ***    32.2   21.6   22.7    33.0
Lettuce      RG3                      ***    17.2   15.0    32.8
Lettuce      RG4                             ***    44.3    22.7
Tobacco      N gene                                 ***     21.6
Arabidopsis  RPS2                                           ***

The regions homologous to the primers are included in this analysis as the
genomic sequences for RLG1 and RLG2 were determined by IPCR. Interestingly, the
genomic sequences for RLG1 exactly matched that of the primers used.

To obtain further evidence that we had amplified resistance genes, we
amplified the regions flanking the NBSs of RLG1a and RLG2a by IPCR of BAC clones.
These products were then directly sequenced without cloning to minimize the introduction
of PCR artifacts. Sequence analysis of the 5' regions failed to detect any homology to
known resistance genes. However, the sequence of the 3' region contained leucine-rich
repeats (LRRs). When this sequence was used to search GENBANK using BLASTX, it
detected identity to the Arabidopsis resistance gene, RPS2. This region does not contain as
regular LRRs as in some resistance genes; however, the repeat structure seems to be
consistent with that of the flax resistance gene, L6. Therefore, the presence of an LRR
region is further evidence that the sequences we amplified using degenerate oligonucleotide
primers are probably resistance genes.

The sequences of the IPCR products also provided the genomic sequences of
the regions complementary to the sequences of the degenerate oligonucleotide primers.
The genomic sequences for RLG1 were identical to one of the primers in the mixture.
The RLG sequences are resistance genes as supported by three criteria: the presence of
multiple sequence motifs characteristic of resistance genes, genetic cosegregation with
known resistance genes, and their existence as clustered multi-gene families. The presence
of LRR regions in a similar position relative to the NBS as in cloned resistance genes
provides stronger evidence than relying solely on sequence similarity between NBS regions.
The clustering of RLG sequences at the same position as the known clusters of resistance
genes makes them strong candidates for encoding resistance genes. The hybridization
patterns and genetic distribution of the RLG sequences are similar to those of cloned
resistance genes in other species. Most of these hybridize to small multigene families and
preliminary genetic evidence indicates that they are clustered in the genome. Therefore,
the degenerate primers that we designed from other resistance genes seem to have been
specific enough to amplify resistance genes rather than P-loop containing proteins in
general.
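
As an illustration of what it means for a genomic sequence to be identical to one primer in a degenerate mixture, the sketch below checks a target against a primer written with IUPAC nucleotide ambiguity codes. The primer and target shown are hypothetical and are not the primers used in this work; the function names are likewise assumptions made for the example.

```python
# IUPAC nucleotide ambiguity codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def matches_degenerate(primer, target):
    """True if target is one of the sequences encoded by the degenerate primer."""
    if len(primer) != len(target):
        return False
    return all(
        base in IUPAC[code]
        for code, base in zip(primer.upper(), target.upper())
    )


def degeneracy(primer):
    """Number of distinct oligonucleotides in the primer mixture."""
    count = 1
    for code in primer.upper():
        count *= len(IUPAC[code])
    return count


def find_primer_sites(primer, sequence):
    """Start positions (0-based) where the sequence matches the primer exactly."""
    k = len(primer)
    return [
        i for i in range(len(sequence) - k + 1)
        if matches_degenerate(primer, sequence[i:i + k])
    ]


if __name__ == "__main__":
    primer = "GGNGTNGGNAARAC"   # hypothetical degenerate primer
    genomic = "TTGGTGTGGGTAAGACAA"  # hypothetical genomic fragment
    print(degeneracy(primer))
    print(find_primer_sites(primer, genomic))
```

A match under this test means the genomic sequence coincides with one member of the mixture at every position, which is the sense in which the RLG1 genomic sequences are described above.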

RLG1A



1 

81 



ATCGTAACCGTICGTAOSAG AK( 
AATTTTCTGGTTATnTAAA TE 




161 ATTT^AATGT3UlTAACfiATA AAT£»TRTTTRTTTTTCIT TRAATAAACGCMMAATAT ATRGATTAAAA'^CaTATAAT 
241 ACATAGGTIAASCTCATATA AXAPlTATGTTC a TCOCCAG aTmTmT AWlCT CA'ICC TTAATTTATTTATIATraT 
321 a T A TT A CSaCrftGATSATCrr TGTGATATTAAAAATmAT rCCTTCAAAATTTAAAATTA TTAATA ATCCCRCAA'mGA 
401 ATAAAATTAAAAAAAATGGN CCCACCATTAGTCCATCACT TXTICAGCrCATCAATATCG TaJCT Al'lVltr nU/riTC 
481 CaCCCTAATCaATATTTCCA QCGAATGACAGACTCCTACG GCC?ITICIGAATTrGCGrrC CXZACACTG TICAT TGAAGGA 
561 GATAATAAATCAAATGGAGC TGCTCCAATCTTICATTGCTG ATGAAAGGTaAAriCTATCTr GAAGANAATGTCAGCGATCN 
641 ATCTCCATCCGGAACCCACC ACATTATCAGTCTACCACCA AACCACTCAAAACGGYGGAA GTAGRRAKA a^mKAAAGTCA 
721 ItSAAGAATAGATrATnTrG TCCTCATOXXTOACTGAGG AGCGGCJl TrAGITCAT CATT rritrmUAMCAAAGAATTA 
801 TCGGTCCATCGAArmTAC ATCGACAAAGAAGTITCACT TCQCAAaUX'mUi 'fAAACA Ai-rX^ n-AA'ii :nTrrATCTr 
8B1 TTCGTTGAAACTCCTCAATT GCAACTTGCAACTrGCAACT TTTGGGCCrACAAATITCTG GTGGGCGTTAATTTAATCCA 
961 CATATTCACTGTAAACAATA ATOZAAATOSATCTCTGrrTC ATCXIAATTCATCAACATCTC TTGATAATT GAAATCATT CA 
1041 CGCTTCAraT^TITCATCCA CATCTATACTATATICTCTS CraTATCATATTAAACGAT GGCIUAAATCGTTCTriCTG 
1121, C Crrii:i ' :U ACA Lriu;i UT iT GAAAAGCTGGCATYTGAAGC CTrGAAGAAGAT TGTICGC T CCAAAAGAATTCAATCTGAC 
1201 CTIAAGAAATIGAAGGAGAC ATIMACCAAATCCAAGATC TC3CTTAACGATGCrTCC»G AAQGAAGtAACTAATOAAGC 
1281 CGTIAAAAGATGGCIGAATG ATCICCAACATTTDGCTrAT GACATAGAOCSACCTaCTTCA TG ATYTTGC AACTGA AGCIO 
1361 TTCAl'XCTGAGlTBACCGaG GAGGGTGGAG CX r r CC X C CA G T A T BG T A AGRAAACTAATCC CAAGTIGTTGCACAAGITIC 
1441 TCACAAAGTAAIAGGSATGCA TCCCAAGTIAGATSATATTG CCAaayGGTTACAAGAACTG GTAGAGGCAAAAAATAATCT 
1521 TOGTTTAAGTGTGATAACAT ATGAAAAGCCAAAAATTGAA AGGrrATGAGGCGTCTTTGGT AGATGAAAGC GglA CTGTCG 
1601 GACGTGAAGATGATAAGAAA AAATTGCTGGAGAAGCTGTT GGGGGATAAAGATGAATCAG GGAGTCAAAACTTCAGCATC 
1681 GTGCCCATAGTTGGTATGGG TGGAGTTCGTAAAACAACTC TAGCTAGACmTCTATQAT GAAAAGAAAGTGA AGGATC A 
1761 CTICGAACrCAGGGCrTGGG TITGTGTTTCTGATGAGTTC AGTGTTCCCAATATAAGCAG AGTTATrTATCAA TCriCTGA 
1B41 CTGGGGAAAAGAAGGAGTTT GAAGACTTAAATCTGCITCA AGAAGCTCTTAAAGAGAAAC TTAGGAAC CAG CTAa'nirrA 
1921 ATAGTTrrGGATGATGTGTG GTGTGAAAGCTATGGTGATr GGGAGAAATTA GTGGGCCCA xiwi-i \jCGGGU^- i< XTO3 
2001 AAGTAGAATAATCATGACAA CTCGGAAGGAGCAATTGCTC AGAAAGCTCCGCOTTTCTCA IXlM^GfiC^rcrCGAiSGarC 
2081 TATCACAAGATGATGCITTG TCTTrGTlTGCICAACACGC ATITGGTGTACCAAACTTTG ATTCA CATC CAACACTAAQS 
2161 CCACATGGAGAACIGTITGT GAAGAAATGTGATOGCTIAC CTCTAGC/TTAAGAACACrT GGAAGGIT ATIAAGG ACAAA 
2241 AACAGACGAGGAACAATGGA AGGAGCTGTT GgA TAglGAG ATATGGAQGTTAQGAAAGAG C GATG AGA'l'it:fiTi:u:;CCTC 
2321 TTAGACTAAGCTACAATGAT CmVmCa - . ' CrmXA AGCT R ' n ' K ' ri ' IUI ATA^'ltjClTZCr TGTITCCCAAGGACTATGAG 
2401 TTTGACAAGGAGGAGrrrGAT TCT A TTGT GG ATGGCAGAAG GGTTTTTGCACCAACCAACT AYAAACAAGTCAAAGCAAOG 
2481 K' n i XX TXXJ'l'l G AATA a ' ri ' ril AAGAGTTRTTGTCAAGRTCR TTITITCAACATGCTCCTAA TRRCAAATCSTrGTITGTGA 
2561 TGCATOACCTAATGAATGAT TItXXrrACAriTGTTGCTGG AGAATTrmTCAAGGTTAG ACATAGAGATG AAGAA GGAA 
2641 TITAGGATGr^AATCrrrGGA RAAGCACCQ'iCATATGTCAT TTGTATGTGAGRATTACATA GGTTACA AAARGTrC GAGCC 
2721 ATTTAGAGGAGCTAAAAA1T TGAGAACAITITIAGCATTG TCIGTTXSGGGTGGTA GAAG A TTOGAAGATGTITrACTrAT 
2801 CAAACAAGGTCTTGAATGAC V^n^ACrTCARGATTTACCATT GTTAAGGCTrCCTRAKTnGA TTRKTCT TAYAA TAASYRAG 
2881 GTACCAHAAICrCGTSGGTAG TATGAASCACTTGCGGTATC TTAATCTATCW GRAACm a. ATCftiQO VITI ACCXSGAAW A 
2961 TKTCTOCAATCTTTA'EAATT TACARACCCTGATTGTKICT GGC TGTG A^rrATmG TIAA KTllXX tJAARACC AUVxiAA 
3041 ASCTTAAAAATITGCASCAT TTIGACATGAGGGRTACrCC KAARI TRAA BAAC ATOCC Cr TA RGGMTQGTG ART TSAAA 
3121 ARTCTACAAACTCTCITO-SG TAACATTGGCATAGCAATAA CCGAGCrTAAGAACITGCMl AAYCTCCATGGGAAARTTXO 
3201 TATTGGCGGGCIOGGAAAAA TGGAAAATSCM GTKG GATGC ACXaTAAGCGAACTICTCTC AAAAAAGGTOiAATGARTTA 
3281 NAAACrGGro-7n%GGG0GTGA TPAATTTAATGTnTCOGAA ATOGGAACACITGAAAAAGA AGTCCT CAATG AAGTGATGC 
3361 CTCATAATGGTACTCTAI^ AAAACCCANAATTATGrCTA TAGGGGGTATAGAGmTCCA AATrGGGTTGGTITICACrAA 
3441 GGGTrrCTGAAACTAGAGAT GTGTTCATGGTGTATGAAAA AGAITrGTnTACGTAGTTTC ATCAATCACCAAGTGQGAAA 
3521 TAGATGATATTITCAGGGC/ TACTGATGAGATGTGGAGAG GTATGATAGGGTTTrCITGGG GCGGTAGAAGAAATA AGCAT 
3601 CCATTCrrTGTAATGAAATAA GATATVTGTQGGAATCAGAA GCAGAGGCAAGTAAGGTTCT TATGAATTTAAAGAAGTrGG 
3681 ATrTAGGTGAATGTGAAAAT TTGGTGACTTTAGGGGAGAA AAAGGAGGATAATCATAATA TTAATAGriGGGAGC AGCCTA 
3761 ACATCTnTAGGAGGTTGAA TGTATGGAGATGTAACAGCT TGGAGCATTGCAGGTOrCCA GATAGCaTCGAGAATTTGlA 
3841 TATGCACATGrrCTTGATTCAA TTttCATCCGTCTCCTICCCA ACAGGAGGAGGAC AGAA GAT CMfflCACTIACCATCACTG 
3921 ATIGCAAGAAGCTTTCGGAA GAGGAGXTGGGAGGACGAGA GAGGACAAGAgr GCTIA TAA ACICA AAAAT GCAGATGCIT 
4001 GAATCAGrTAGATATACGTAA TTGGCCAAATCXGAAAroiA TCAGIGAATIGAGTIGCTrC ATICACCTGAACAGATTATA 
4081 TATATCAAACTOTCCGAGTR TGCaGTCATTTCCIXSACCAT GACjTIGCCAAATCTCAOCTC CTTAACAGATCGAAQGAGAC 
4161 GACUGCSATTITCGTaCGAA axrXTACGATTCGACTQGCC GTOGTnT 
[Strand]

I AACXGTTCGT ACGAGAATCG CTCSTCCTCTC CTTCCTGTAA TATAATGATA AGAAAAAATA TGATTAAAQG 
71 TPTAAATCCA AAATCCATTA 1TCCACCGGT GATATGATGC ACTAGCTGTA GTATG CAAA A ACAGTATIAT 
141 AAATGCEAAC CAAAACAGCA GCTAAGAAAC AATATAAATA ATCCnTIGAA imilXT riXJ TCCGrrACA<7r 
211 CATTTCTTCC AAATCCCTAT CATTCATACA TACAAGTGCT CCCATATTAG GmTCACTA TAAGCAATQS 
281 CTGAAATCCT TGGTTCTGCG TTCrTTGCGG TGTTCTTTGA AAAGCITGCT TCTGAAGCCT TGAAGAGGGT 
351 TGCTTCCTCC AAAGrTAATTG ACAAGGAGCT CGAGAAATTG AATAGCTCAT GAATCAATAT AAAAGCTCTG 
421 CTCAATGATG CTTCICAGAA GGAAATAAGT AAGGAAGCTG TTAAAGAATG GTTGAATGCT CTTCAACATr 
491 TGCCTTACGA CATAGATGAT CTACTTGGCG ATTTGGCAAC CAAAGCTATC CATCGTAAGT TCTCTGAGGA 
561 ATACGGGGCC ACX^TCAACA AGGTACGAAA GTTAATTCCA lUlTOmCT CTAGITICTC AAGTACTAAG 
631 ATGCGCAACA AGATACATAA TATTACCAGC AAGTTACAAG AACIATCAGA AGftG ftGAAAT AATCXTGGAT 
701 TATGXGAAAT TGGIGAAAGC CGAAAACTTC GAAATAGAAA A1CAGAGACC TCmDQCTAG ATCCATCTAG 
771 TATTGnOGA CX3CACAQATG ATAAGGAAGC GnGCTTCTC AAG CTATA TG AAOC ATSTGA TAGAAACITT 
B41 AGCATCTTGC CNAIAGTIGG TATGGCTGGG TTAGATAAGA CCACTITAGG TAGACTTTTG TATGATNAAA 
911 TGCAAGTGAA GGATCACTTC GAACTCAAGG CGTGGGTITG TGTTTCTGAT GAGTl'lUATA TCTTCGGTAT 
981. AAGCAAAACC ATTTTCGAAT CGATAGAGGG GGGAAACCAA GAGTTTAAGG ATTTAAATCT GCTTCAGGTG 
1051 GCTTTAAAGG AGAAAATCTC AAAGAAACGA TnirnijTm TTCTTCATGA TGTATGGAGC GAGAGCTATA 
1121 CTCATIGGGA AATTCTAGAA CGTCCATTTC TAGCAGGAGC ACCAGGAAGT AAA GTAATCA TCACAACCCG 
1191 CAAGTICTCG TTGCTAAACC AATTGGGTCA TGATCAACCA TACCAATTGT CTGAnTGTC ACATGACAAT 
1261 GCTCTATCCT TATTITOrCA ACACGCATTT GGTGTAAATA GCTITGATTC ACATCCGATA CTTAAACCAC 
1331 ATGGTGAACSG TATTGTTGAA AAATGTGATG GTTTGCCATT GGCTTTGATT GCACTTGGGA GGTTATIGAG 
1401 GACAAAAAGA GATSAGGAAG AATGGAAGGA ACTATTGAAT AGTGAGATAT GGAGGTTAGG AAAGAGAGAT 
1471 GAGATTATTC CGGVTCTTAG ACTAAGCTAT AATGATCTTT CTGCCICTTT GAAGCAGTTG TTIGCATATT 
1541 GCTCC77GTT CCCCAAAGAC TAIX7IGTXCA ACAAGGAGAA GTXGATTTTA TTATG6ATGG CAGAAQGGIT 
1611 TTIGCACAAT GAAAATACAA ACAAGTCAAT GGAACGCTTA GNTCnGAAT ATTITGAOGA CTTGTTGTCA 
1681 AGGTCATTTT TTCAACATQC ACTC GATG AC AAATCGriTGT TTGT GG T G CA CGACXTTCATG AATGACTTOG 
1751 CXAC A TCTGT TGCTGGAGAT TMTiTlTAA GATTAGACAT TGAAATGAAA AAGGAAGCTT TGGAAAAATA 
1821 CCGACATATG TCATITCnT GTGAGAGTTA CATGGTTTAC AAAAGGITCG AACCATTTAA AGGAGCTAAA 
1891 AAATTGAGAA CTTTCTTAGC AATGCCrCTT GGGATQATAA AAAGTTGGAC AACATTTTAC TTATCAAATA 
1961 AGGTCCTIGA TGACTTACTT CACGAATTAC CATTGTTGAG AGTICTAAGT TTGAGTTATC TTAGCATCAA 
2031 GGAGGT7.CCT GAAATAATAG GCAATTTGAA ACACTIGCGG TATCTTAATT TATCACACAC GAGTATCACA 
2101 CATTTXCCAG AAAATGTCTG CAATCTTTAC AACTTACAAA CATTGATCCT TTGTGGCTGT TGmTATAA 
2171 CCAAGTrrCC CAACAACTTC TTAAAGCTTA GAAATTTACG GCATTTGGAC ATTAGCGATA CTCCCGGlTr 
2241 GAAGAAGATG TCCTCGGGGA TTGGTGAATT GAAGAACCTA CACACYCTCT CCAAGCTCAT TATTGGAGGT 
2311 GAAAA'TAGAC TAAACGAGCT TAAGAACTTA CAAAATCTCC ATG 
1 TACTACTACT AGAATTCGGT GTTC3GTAAGA CGA<?rCTAGC TAGACTTTIG TATGAGGAAA TGCAAGGGAA
71 GGATCACTTC GAACTTAAGG CGTGGGTATG TGTTTCTGAT GAGTITGATA TCTTCAATAT AAGCAAAATT 
14X ATCTTACAAT CGATAGGTGG TGGAAACCAA GAATTTACGG ACTTAAACCT GCTTCGAGTA GCTTTAAJWUG 
211 AGAAGATc^rC AAAGAAAAGa TTTCTTCTTG TICTTGATGA TGTTTGGAGT GAAAGCTATA CCGATIGGC5A 
281 AATItTTAGAA CGCCCATTTC TTGCAGGGGC ACCTGGAAGT AAGATTATTA TCACCACCCG GAAGCTGTCA 
351 TTGTTAAACA AACTCGGTTA CAATCAACCT TACAACCTTT CGGTTTIGTC ACATGAGAAT GCTTTGTCTT 
421 TATTCTCTCA GCATCCATTO GGTGAAGATA ACTTCAATTC ACATCCAACA CTTAAACCAC ATCGCGnAQS 
491 TATXtmCAA AAATGTGATG GeOTGCXIATT GGCATTGTOG ACAT6ATGAT GATO 
1 TCCCGTGCAA CGTCTATCAT TC:AGAAG^EC CCAAAGACCA. I-aGAHTIGTr TAANGNfTGNT lOTCAGAAQS

71 AAGIS^TTGA TGAAGCTCmf AAAAGAITCGC TGATTOATNT CCAACAATTQ GCTTACGACA CTSANGAQCl 

141 ACnG^^TGIAT NICGCAACAG AAGCTATICA TCGTGAGITG ATCCGTGAAA CTGGAGCTrC a^CCAGCATG 

211 CIA.AGAAAGC TAATCCOVAG TTGTIGCAC\ AGTTTCICAC AAAGIAATAQ GATGCATCCC AGGTIAGATG 

281 ATATTQCCGC TAAGIT^ACAA GAACTQGTAG AGGOS^ftAAA TAATCTIGGT TTAR tflU T G A TAACAIAOQI^ 

351 AAAACGCAAA ATTSAAAGAG AIGAGGGGDN TrTGGTAGAT GCAA8TCGTA TG AT TBGACG TGAAGATGAT 

421 AAGAAAAAAT IGCTTCAGAA GCIGTTGGGG GAIACTXATS AATCAAGTAG TCAAAACTTC AACATCGIGC 

491 CCAlMrrGG TATGGC3TGGG GTaOGTAAAA CAACTCTAGC TAGACTTTTG TATGATGAAA AAAAAGTCAA 

561 GGATCACTTC GAACTCAGGG TTTGGGTTTG TGTTICTGAT GAGTTCAGTG TICCCAATAT AAGCAGAGTT 

631 ATCT^TCAAT CTGTGACTCG TGAAAACAAA GAATTTCCAG ATTTAAATCT GCTTCAAGAA GCCCTTAAAG 

701 AQAAACTTCA GAACAAACTA TTTCTAATAG TTTTAGATGA TGTATGGTCT GAAAGCTATG GTGAITGGGA 

771 GAAATTAGTG GGCCCATITC ATGCTGGGAC TTCTGGAAGT AGAATAATCA TGACTACTCG GAAGGAGCAA 

841 TTACItMAC AGCTGGGTTT TTCTCATGAA GACCCTCTGC ATAGTATAGA CTCCCTCCAA CGTCTATCAC 

911 Ai^GAAGATGC TTTGTCTITG TTITCTCAAC ACGCATTTGG TGTACCTAAC TTTGATTCAC ATCCAACACT 

981 AAGGCCATAT GGGa^^ACAGT TTGTGAAAAA ATGTGGGGGA TTGCCTTTGG CLTIVJl' 
1 QTlACCTrrC

71 TGAAGCCGTT 

141 CTTGCTJUCAS 

m CTAATCCCftA 

281 CCAUCriTAOV 

351 AATTQAAAGG 

421. TTGATG3AGA 

491 T AT GGCTGGA 

561 GAACTCAGGG 

631 CTC3TGACCGG 

701 AAACAAACTA 

771 GGCCCVmC 

841 AGTTGGGTTT 

911 rrm ' rwVriu 



TACGAGATCG 
AAAAGATGGC 
AAAGCTATTC 
G - n tS T T G CAC 
AGAACTGGTA 
TATGAGGCAT 
AGCTGTZGGA 
GTTGGCNAAA 
CrrGGGTlTG 
GGAAAAGAAA 
TITCTAATAG 
ATGCTGGGAC 
TTCTCATCAA 
TTTGCTCAAC 



CTGTCCCTCC 
TGAATGATCT 
tJTCSTCAGTT 
AAGTTTCTCA 
GftGGC AAAAA 
CTITGGTAGA 
GGATAAAGAT 
CAACTCTPJGC 
TGTTTCTGAT 
GAGTTTGAAG 
TTTTGGATGA 
TTCTGGAAGT 

gacxx:tctgc 
acgcatttgg 



TCGATCTGCT 
CCAACATTTG 
GACCGANGAA 
CAAAGTTATA 
ATAATCTTGG 
OGAAAGTIGGT 
GAATCCGGAG 
TAGACTCTTG 
GAATTCAGTA 
ACTTAAATCT 
TGTATGCTTCG 
AGAATAATCA 
GTTGTATAGA 
TGWCCA 



TAACGATGCT 
GCTTATGACA 
GGTGGAGCCr 
GGATG CATGC 
TTTAAGTGTG 
ATITTtGGAC 
TCHAAALTIXJ 
TTTGATGAAA 
TTCTCAACAT 
GCTTCAAGAA 
GAAAGCTATG 
TGACTACTCG 
CrCCCTGCAA 



TCCCAGAAGG 
TANACGACCT 
CCACCACnAT 
CAAGTXAGAT 
ATAACATATG 
GTItlAGATGA 
AGCATCCXGC 
AGACAGTZGAA 
A AGCAAA GIT 
GCTCTEAGAG 
GTGATTGGGA 
G AAGGA GCAA 
CGTCTATCAC 



AAGTTiACTAA 
ACTTGATGAT 

ggtaagaa;^ 

GATATTGCCA 
AAAAGCCCAA 
TNAGAAAAAA 
CCATAATTQU 
GGATCACnC 
ATCTATCAAT 
GGAAACTACA 
GAAATTAGTG 
TTACTCAAAC 
AAGATGATGC 
RLG1E
[Strand]



1 TCTAGCTAGA CTTTTGTATG ACGAGATGCA AGAGAAGGAT CACTTCGAAC TCAAGGCGTG GGTrTGTGTT 

71 TCTGATGAGT TTGATATATT CAATATAAGC AAAATTATTT TCCAATCGAT AGGAGGTGGA AACX3VAGAAT 

141 TTAAGGACTT AAATCTCCTT CAAGTAGCTG TAAAAGAGAA GATTTCAAAG AAACGATTTC TACTTGTTTr 

211 TGATGATGTT TGGAGTGAAA GCTATGCGGA TTQOGAAATT CTGGAACXSCC CATTTCTTGC AGGGGCAGOC 

2B1 GGAAGTAAAA TTATCATGAC GACCCGGAAG CAGICATT GC TAACCAAACT CG glTACRAG CAACCITACA 

351 ACCmCCGT TTTGrCACAT GACMTGCTC TCTCTTTRTT CTGTCAGCAT GCATTGGGT6 AAGATAACTT 

421 CGATZCACAT CCAACACTTA AACCACA1X3G GGAAGGCATT GTTSAAAAAT GTQCT 
[Strand]



1 ATTTra^GCT Q2AAACAAAN AAAAGCAATG GCTGAAATCT TTCTTTCNGC ATTCTAGACC AGTATTCTTT 

71 GAAAAGNTGG CTTCTGAAGC CTTGAAGAAG ATCGCTCGCT TCCATCGGAT TGATTCTGAG CTCAAGAAAC 

141 TGAAGAGGTC ATTAATCCAG ATCAGATCTC TGCTTAATGA TGCTTCTGAG AAGGAAATAA GTGATGAAQC 

2X1 TGTTAAAGAA TQGCTGAATC GTCTCCAACA TnCTCTTAC GACATAGACG ACCTACTTCA TGATTTGGCA 

2B1 ACOGAAACTA TC3CATCGTGA GTIGACCCAC GGATCTGGAG CCTCXACCAG CTTGTAAGAA AGATAATOCX: 

351 AACTTGTtGC ACAGATTTCT CACTAAGTAG TAAGATGCGT AACAAGITAG ATAATATTAC CATCAAGITA 

421 CAAGAACTGG TAG AGGAAAA AGATAATCTT GGCITAAGrrS TGAAAGSIGA AAGOOCAAAA CATACCAACA 

491 GAAGATTACA GAOCTCTITG GTAGATGCAT CTAGCATTAT TGCSTCGT S AA GGIGATAAGG ATGCATTGCT 

561 CCM5UGCTG CTGGAGGATG AACCAAGTGA TAGAAACTTP AGCATCGTCC CAATAGITOG TATGGGTOGT 

631 GTGGGTAAGA CGACTCTAQC TAGACTTTTG TATGACGAGA TGCAAGAGAA GGATCACTTC GAACTCAAGG 

701 CGTGGGTTTG TGTTTCTGAT GAGTTTGATA TCTTCAATAT AAGCAAAGTT ATCTTOCAAT CGATAGGTGG 

771 TGGARA CCAA GAATTTAAGG ACTTAA ATCT CCTTCAAGTA GCTGTAAAAG AGAAGATTTC AAAGAAACGA 

841 TTTCT.7YTTG TTCTGGATGA TGTITGGAGT GAAAGCTATA CAGAATGGGA AATTCTAGCA CGTCCATTTC 

911 TTGCAGGGGC ACCAGGAAGT AAaATTATCA TGACGACCCG GAAGTICTCG TTGCTAACCA AACTCGGTTA 

981. CAATCAACCT TACAACCTTT CSGTTTTGTC ACATGATAAT GCTVTGTCTT TATTCTGTCA GCAYGCATTG 

1051 GGTGA AGATA A CTTCGA TTC ACATCCAACA CTTAAACCAC ASGGTGAAAG TATTGriTGAA AAATGTGACG 

1121 G Tri'ACC ATT GGCrTTRATT GCACTTGGGA GRTTGTTGAR GACAAAAACA GATGAGGAAG AATOSAARGA 

1191 AGTGCTGAAT AGTGAAATAT GGGGGTCAGG AAAGGGAGAT GAGATTGTTC CGGCTCTTAA ACTAAGCTAC 

1261 AATGATCTCT CTGCXrii;rrT GAAGAA GTIG TTTGCATACT GCTCCTroiT CCCAAAAGAC TATGTGTTCG 

1331 ATAAGGAGGA GTIGATTTTG TTOrGGATCG CAGAAGGGTT TTTGCACCAA TCAACCACAA GCAAGTCBAT 

1401 GGAACGCTTG OGKCATGAAG GTTTTGATGA A.TTGTTGTCA AGATCATTTT TTCAACATGC CCCTGATQOC 

1471 AAATCGATGT TTGTGATGCA TGAOCTGATG AATGACITGG CHACATCTGT TGCTGGAGAT TTTTTTTCAA 

1541 GGATGGACAT TGAGATGAAG AARGAATTTA GGAAGGAAGC TTTGSAAAAG YAYCGCCATA TOTCAOTICT 

1611 TTGTUAKGAT TACATGGTKi: ACAAAAGGTT CRAGCCATTS ACAAGGAGCT AG 
RLG1G
[Strand]

1 GTGAAGGATC ACTTCGAACT CAGGGCTTGG GTTrGTGTTT CTGATGAATT TAATATCCTC AATATAAGCA 
71 AAGTAATTTA TCAATCTGTA ACCGGGGAAA AAAAGGAGTT TGAAGACTTA AATCTGCTTC AAGAAGCTCT 
141 TAAAGAAAAA CTITGGAATC AGTTATTTCT AATAGTTCTG GATGATGTGT GGTCTGAAAG CTATCGTGi^T 
211 TGGGAGAAAT TAGTGGGCCC ATTTTTTTCG GGGTCTCCTG GAAGTATGAT TATCATGACA ACTCGGAAQG 
281 AGCAATTGCC AAGAAAGCTG GGmTCCTC ATCAAGACCC TTTGCAAGGT CTATCACATG ACGATGCTIT 

351 c» - ii:: ' : ' :i» ' riT gcicaacacg cattiggtot acca 



WO 98/30083    PCT/US98/00615



RLG1H
[Strand]



I TCERG CTftGft CTmGTATG AGGAAATGCA AGGGAAGGAT CACTTCGAAC TCAAGGCGTG GGTATGTGTT 

71 TCTGATGAGT TTSATAICTT CAATATAAGC AAAATTATCT TACAATOGAT AGGTGGTGGA AACCAAGAAT 

141 TIACG GACTT AAACCIGCTT CAAGTAGCTT TAAAAGAGAA GATCTCAAAG AAAAGATXTC TlUi'lU'l' i V r 

211 TGATGnlGTT IGGAGTGAAA GCTATACCGA TTOGGAAATT CIAGAAOGCC CATTICrXCC AGGQGCACCT 

281 G GAAGTftAGA TTATTRT CAC CACOOGGAAG CICTCATTGr TAAACAAACT CGGTIACAAT CAACX:TTACA 

351 ACCTTTCGGT TTTCICACAT GAGAATGCTT TGTCTTTATr CICTCAGCAT GCATTGGGTG AAGATAACTT 

421 CAATTCACAT CCAACACTTA AACCACAOGG OGAAGGTATT GTTGAAAAAT GTGAT 
RLG1I

[Strand] 

1 TCTMCTAGA CTTGTGTATG ATGAGATGCA AC3AGAAGGAT CACTTTGAAC TCAAGGCGTC GGTATGTGTT 
71 TCTGATGAGT TTGATATATT CV^TATAAGC AAAA1TATTT TCCAATCGAT AGGAGGTX3GA AACCAAGAAT 
141 TTAAOSACTr AAACCTCCTT CAAGTAGCTG TAAAAGAGAA GATTTTAAAG AAACGATTTC 'i'lClTOriVr 
211 TGACGACGTT TGGAGTQAAA GCTATGCCa^ TTGGGAAATT I3TGGAACGCC CATTTCTTGC AGGGGCAGCC 
281 GGAA3TAAAA 1TATCATGAC AACCCGAAAG CAGTCATTGC TAACCAAACT CGGTTACAAG CAACCTTACA 

351 Accrrrccx?r ttictcacat a^CAGTGcrc tgtctttatt ctgtcagcat gcattgggtg aaggtaactt 

421 CGAT^CACAT CCAACACTTA AACCACATGG CGAAGGCAOT GTTGAAAAAT GTGCTGGATT GCCATIGGCA 
491 TTGTCGAC& 






RLG1J



1 TfiCtMrr^srr MSPATrcccrr g tto g taaga. cgac;ictagc TAGAcrmG tatgag gaaa tgcaaggga^v 

71 GGATCnCTTC GAACTXAAOG COIGGGTATG T6TTTCTGAT GAGTTTGATA TCTTCAATAT AAGCAAAATT 

141 ATCrCACAAT CGATAGGTG6 TOaAAACCAA GAATTTACXSG ACTIAAACCT GCTXCGAGTA GCTTTAAA^ 

211 AGAAGATcfTC AAAGAAAAGa ' m C I T CTXtS TICTTQATGA TGTTIGCAGT GAAAGCTATA CCGATTOGGA 

281 AATrrTTAGAA CGCCCATTTC TTGCAGGGGC ACCTG6AAGT AAGATTATTA TCACCACCCG GAAGCTOICA 

351 TTOITAAACA AACTCGGTTA CAATCAACCT TACAACCTTT CGGTTTTGTC ACATGAGAAT GCTTIGTCTT 

421 TATTCTGTCA GCATGCATTG GGTGAAGATA ACTTCAATTC ACATCCAACA CTTAAACXaC ATOGCGfAQCS 

491 TATTCTTGAA AAATGTGATG GaCnxXTATT GCXIATTGTCG ACATGATGAT GATG 
IVTVRTR?LSLLHLLSYVlFS?l?PH?ILWLF.lNFYSTCHFMSFSILLSFT.YLNVmNAYLFFi=K.THIIYR

LKSYNT.VKLI.YICSSFVYLYVSSL1YLLFIY.SR.SLY.KFNLFKI.NY..SHNLNKIKKNGPTISFSLFQUN1V 

SltJJlRHPNQYFQRNm)SYGVSEFAFRHCSLKEIINQMELLQCSlJLJi4KGELYVK?MSAl?mPEPTTLS 

YHQTTQNGGSR?T?KS.R1DYFCPHGLTEERV.R1FL?KNYRSIEFLHRQRSFTSQCFVKQFUFLSFR.NS 

SlATCNLGLLGPQlCGGaFNPHIHCKQ.FKSlSVHPIHQHLLllEIIHASSlSSTSILYSLLLSY.TMAEIVLS 

AFLWVFE<U?EAU<KIVRSKRIESELKKLKETLDQlQDLLNDASQKEVTNEAVKRWmDLQHLAYDID 

DLLDD?ATEAV?RELTcEGGASSSMVRKUPSCCTSFSQSNRMHAKLDDIATRLQELVEAKNNLGI_SVl 

TYEKPKlEHYEASLVDESGWGRBDDKKKLi£KLLGDKDESGSQNFSlVPIVGMGGVGmiJ\^ 

KVKDHFELRAWVWSDEFSVPNISRVIYQS\n-GEKKe=EDmi±QEALKEKmNQlJ^ 

GDWEKLVGPFUS.GSFGSRIIK/rrTRKEQllRKLGFSHQDPliGL5QDDALSlJ=AQHAFGVPNFDSHFr^ 

PHGELFVKKCDGLPLALRTLGRl±RTKTDEEQWKELmSEIWRLGKSDElVPAmi^YNDLSA?LK^ 

YCSLFPKDYEF[KEEL:ilWIWl^EGFLmFr?NKSKCMjGLEYF?ELI^SFFC5^ 

U^TFVAGErFSRLJDIB;1KKEFRM?SLEKHRHMSFVCE?YlGYK?FEPFRGAKNLKrRj\LSVGW 

MFYLSNKVLND?LQDLPLLRVL?LI?L?I??VP??VGSM?HLRYLNLS?T?ITHLPE??CNLYNLQTLIV 

SGC?YLV?LPICn=S?LKNL7HFDMR?TP?lJ<NMPL?IGELK?LQTLF?NIGIAITELKNL?NLHGK?CIGG 

LGKMENAVGCTLSELVSKKV7.77NW77G..I.GFPKWEHLKKKSSMK.CLIMVL7KKP7IMS1GGIEFPN 

WVGSmVSETBDVR.4VYEK7CFT.FHQSPSGICMIFSG?TDEMWRGMIG?LGAVEEISIHSCNElRYLWE 

SEAEASKVIMNLKKLDLGECENLVSLGEKKEDNHNINSGSSLTSFTIRLNV^ 

MHMCDS7TSVSFPTGGGQKIKSLTITDCKKL5EEELGGRERTRVUNSKMQMLESVDIRNWPNLKSISEL 
SCF1HLNRLYISNCPS7ESFPDHELPNLTSLTDRRRGQRFSYERLRFDWPSF 
NRSYENRCPLiPVI...EKl.LKV.IQNPLFHR.YDALAWCKNSIINANQNSS.ETl.lMV.IVLSPYTHFFQIPII

HTyKCSHIRFSlJUVIABLGSAFFAVFFEKlASEALKRVACSKVIDKEL£KU^SSJNIKALUJDASQra 

KEAVKEWLNALQHLPYDIDDLLGDLATKAIHRKFSEEYGATINKVRKLIPSCFSSLSSTKMRNKIHNITS 

KLQEU^ERNNLGLLElGESRKLflNRKSETS?LDPSSIVGRTDDKEALLJJ<LYEPCDRNFSILPIVi^ 

DKrTLGRLLYD?MQVKDHFELKAVWCVSDEFDlFGISKnFESEGGNQEFKDLNLLQVALKE^ 

WmDVWSESYTOWEILERPFLAGAPGSIWirrTRKLSLmQLjGHDQPYQLSDLSH 

VNSFDSHPIIJ<PHGEGlVB<CDGLPUyjALGRLlJ^TKRDEEEWKELL^SEIW^ 

LSASLKQlJ=AYCSLFPKDYWNKEmLlWhMEGFLHNENTNKSMERL?L£YRDD 

IJ=WHDmNDU^TSVAGDYFmLDIEMKKEAL£KYRHMSFVCESYMVYKRFEPR<GAKI^^ 

GMIKSWTTPrt^NKVmDUJ4ElPLJJ1VLSL5YLSIKEVPaiGNLKHLJ^YIJNlL5OT^^ 

LCmJLCGGCFrrKFPNNFLKLRNLJWmlSDTPGLKKMSSGIGELKNlJ^Tl^ 

H 
SRAT?1IQK?PKT?D?F????QKEVIDEAVKRWL1D?QQLAYDT7D?LDD?ATEAIHREURETGAS?S

MVRKLIPSCCTSFSQSNRMHARLDDIAAK?QELVEAKNNLGLSV!TYEKPKIERDEA?LVDASGIIGRED 

DKKK11.QK1±GDTYESSSQNFNIVPIVGMGGVGKTTLAR1±YDEKKVKDHFELBV\^ 

RVIYQS\n'GENKEFADU^lXQEALKBaQNKLmVLDD\WSESYGDWEKLVGPmAGTSGSRIlN/r^ 

KECHlKQLQFSHEDPWSIDSljQRLSQE3AlSLFSQHAreWNFDSHPTU^PYGEQFVKKCGGLPl^ 
?T?LflDRCPSSICLTMLPRRK?LMKPlJ<DG.MISNIWLMT?TTYLMILQ?KAI??ELT?EGGASTSMVRK

LIPSCCTSFSQSYRMHAKLDDIATRLQELVEAKNNLGLSVrrYEKPKIERYEASLVDESGlFGR7DD?KK 

LWEKUXDKDESGmQHLPilGMGGVGTTTU^RUJ^DElCWKDHFELRAWVCVSDEFSILJ^ISW 

Nn'GEKKEFEDUsIli.QEAmGKLQNKLFUNAJDDVWSESYGDWEKLVGPFHAGTSGSRIlK^^ 

QLGFSHQDPLRCIDSLQRLSQDDALSLFAQHAFG? 




LAJ^l±YDB/QEKDHFElJ<AWVCVSDEFDlFNISKliraSIGGGNQEFKDmLLQVA\fl<EKlSKKRFL^

D\AWSESYADWEILERPRAGMGSKIIMTTRKQSI±mGYKQPYNLSVLSHDSALSI^CX3H^ 

DSHPTLKPHGEGIVEKCA 
FSA?NK?KQW1J<SIT?HSRPVFFEK?ASEALKKIARFHRIDSELKKLKRSUQ1RSVLNDASEKEISDEA

VKEWmGLQHLSYDIDDLmDlATETT*flHRELTroi£PPPACKKDNPTCCTDFSLSS^^ 

QELVEEKDNLGLSVKGESPKffTNRRLQTSLVDASSIIGREGDKDALJL^iKLLEDEPSDRNFSIVPIVGMGG 

VGKTTLAFU±YDEMQB<DHFEIJ<AWVCVSDEFDIFNISKVIFQSIGGG?QEFKDIJv|LLQVAVK^ 

FL?VmD^AWSESYreWEIURPFU^GAPGSKIl^mRKLSLLmGYNQPYNlSVLSHDNAl5IJ^^ 

LGEDNFDSHPTlJ<P?GESIVEKCDGLPL^LlALGRl±?TmEEEVVKEVLJ^SEIWGSGKGD 

YNDLSASLKKIJ=AYCSLFPKDWFDKEEUaW^MEGFU^QSTTSKSMERlJ3HEGro 

AKSMFVMHDLMNDLATSVAGDFFSRMD1EMKKEFRKEAL7K7RHMS7VCTDYMWKRFPP7TRS. 
VKDHreiJ^WVCVSDEFNILNISKVIYQSWGEKKEFEDUvllXQEALXEKLWNQIJTJV^
DWEKLVGPFT^SGSPGSMIIIOTTIKEQLPRKLGFPHQDPLC3GLSHDDALSLPAQHAFQVP 
LARLLYEEMQGKDHFELKAVWCVSDEFDIFNISKIILQSIGGGNQEFTDLNLLQVALKEKISKKRFLLVLD

D^MSESYroweL£RPFUGAPGSKIIITTRKLSLL^IKLGYNQPYNLS\fl^HE^WLSLFCQHA^ 

SHPTIKPHGEGIVEKCD 

5^© :s^Pi\)0'i^ 
LARLWDEMQEKDHFeJ<AVWCVSDEramiSKIIFQSIGGGNQEFKDIJ4LLQVAVKEKlLKKRFl±,VLD

D\MSESYADWEI?ERPHJ^GMGSKIIMTTF«<QSLLTKLGYKQPYNI^LSHDSAI^LFCQHALG 

DSHPTLKFHGEGIVEKCAGLPLALST 
EFGVGKTTU^LLYEEMQGKDHreLKAWVCVSDEroimiSKIILQSlGGGNQEFTDLJ^LJJ^VA^^

|<RFLLVUDD\MSESYTDWEI?ERPFUGAPGSKIiriTRKl51±NKLQYNQPYNL5>rt^HENA^ 

ALGEDNFNSHPTLKPHG7GIVEKCDGLPLALS 
5^(2^^ Ak9 .'2;

1 TTNACACCAT AAATTCTCNA CCTGNGGGGA CAAAAACCTA AAAATGGTCC ATAATGCNCA AATCAGNAAG 
71 GTTGANAAAG CTCTAAGnT TJmCCTQCA NCTGATGCNC NNTCCTOJTA AAGTTCAMAT CCAAGCTTGC 
141 CCTCCAACTC TAIJCNCCTTC AATGGCACCT CCTTCTCTTC AAAAGCACAC AAGAACACTT TCAAGCTCM 
2X1 CCACACICAC ACAAGCTCTA GAACNAGGGT TAGGGCACAT TTAGGGTTTT GCTCTCTGGA AATGGTGTCT 
281 AAAAGTGAGG CCATAATOTT CCTTATATAA GGCTCACTCC CACAATTAGG CTTTCAATCT GAACaSTANTA 
351 CGCCCAGTOT ACAC r A lWl' ACGCCCAACG TACTCGGTAG TCTCCGCGTC AANAATACAC TCATGAGTAC 
421 GCGCAACGTA CriTCCCTTA CGCCCAGCXTT ACTCAAAAGC CAAACATTCT TITCAAGGAC TAATTTTGAC 
491 AACTTSAGGA AAGAAAAGGA TCAAAGAKAT ATACTTCSAAT TCCGGGATGT TACAATCAAG TTOANACCTr 
561 GGCTAAAAAA TTAAATTOGT TGTGGAAGCC GTTGGCTGAG CAAGCAACAA GGGTAAAATT CGTAATCTAC 
631 AAATGGrrGTT ATTTTCTATT TCTTCTTATT ATTITAC7ITG ATTTACGGGT AGTITITnT TCTTACAAAA 
701 AATATTAAAG 1TGATAAAGT ATAGCCACTA AAATTGACTT TTTCCAAAAC ATAATGTCAA ATGGTGCGTA 
771 T ATCTAT CAT GTTGTAITAN ATAATGAATA TGATGATOCT GTrCTATTTA ANCCGAAAAA ATTATCTAAT 
841 GATTITATAT TGGAAAACAA AGTrGTGATT TTTOGCATAA TATAATCAAA TCXNCTTTTG TNTGGGAGGT 
911 GGATAAATGT GGTAAATTTA NAACAAGTGT TTTNTOrTTG AAGGGrTNTGG AAAGGTTGAA AAAAGTTAAA 
981 ATGATAAAAT GTTTACACAA ATGTTGTATC CGACTGAATA imTCnTTAA GGAlNATTGr ATTAAATTGr 
1051 TGATATAT AG TAAGCATAAA TATTTAGAAT TGTGACTTAA ATTTATAAGT TAINCNAACT GGA1TGAAAC 
1X21 ATTmC ATA TAHATTAGGA ATGAAAATGA GCAACCCTAA CATACTTATC TITGGTAGIT TGGTTATTAT 
1191 A'iTl TrAT TA NAATA TAGAA NCATCCCITT ATTTTAAACC CATATTGTGG AaSGACTTGA ATAAATGGGA 
1261 AAAATOTACC TTGCTATriA GCACAAAAAA ATTATAAAAA TGTACATTGC TATTTAQCAC AAACAAAAAA 
133X AAAAAACTTA TCCTTOTGC ATTAGGTCAC AAAGAAATAT AAAATGGGAA A T GTOTIG CT ATTIAATGCA 
X40X CTAAAAGAAA CrA'lTl 'llX:C TTTATTAAAC CGGGTAAACC AATRGAAAAA TOGAAGTACA TTGTCATTTA 
X471 GCATSAAAAA AAATAACTTT CCAl'm TiXi CATCCGGTCA CAATAATAGA AAAATGAAAG TACCTTOCTA 
1541 TTTAGCGAAA CTAACTICCr TITTTCTTTT TGGCATCGTA TCATAAAATA TAGACTAAAA TACGTTAGTr 
1611 TTACATr m' AATACATTGA AATGTCTAAT CCACAroiTA TTCTATAAAA AGGGAAATGT AATTTACTTA 
1681 TTCTTTSATT CTT TGGCT TC TTTITAGTAC CCAAAACATC CCTCTATCXIA TCTATTCCAA CTAAAATAAT 
1751 GAAAACTATA TrCCTTCCAT TCTAGGGATG TTATAAATTT TGrEAATTCTT TTTATGCAAA AAAGTCTITT 
1821 TTGTrAACTA G ATTAACGA G ATTCATTTIT CAGCATTTTA GGAGAAGTTC ATCCATCITT TGGATATGAA 
1B91 GTGC AAGCCA AGnCTTTAA CATGGAATAT GAGGTCCC TA TATGCTCAAA AAATAGCAAA TGAGAAATTT 
196X TTTAAATrGG ATCC CCATA A AAGAAAATTT GTTAATGGTT GmTAATAT TGGTCAATGT GTCCACCGGA 
203 X TGAGCA1AAT ACTAGTITAT AAQGGGTAAA GGTGGGTITG GTGGGCCCAT TTATCTTTAT TATITCTAAA 
2X01 AGTCAGAATT AAGTA AAAAA AATTATAAGA TAAATACCAT AAGGATAAAA AATCATTTTA TTIGGACCAA 
2171 AGACCAAAGT TGTTAAGGGG CICTrrGTTT TTTTICTGAA GAGCTGTGCA ACCACmTG TCTGCGCCGC 
2241 ACAGACAACG TGCAGACATA TGCCCTCGCA GAGTGTTTGT TnTTGAAAG TGCGCAGACC AAAAAAACGT 
2311 CTGCGCGAGG TCATCCTGGC GCATATATGT GTCACTGTCT TCAAAGGTCT TCAGACCTCA TnTAACCAA 
2381 AAAAA AAAAA GACCACCGGT TTTTmnT TrmOTTCT TTCTCTTGTA GCTGAAAATG CATTnTAAT 
^^^'^^^^^ IGAAATTAAG TTTGAAAAAT TAATTTATTr CAACAGCTGT AGACGTTAAA AACAAACAGT 
2521 CTTCTrOTTG CAGACTGTGG ACATTTGGTC CACCTCTICT ACCGCAGAGA CTTGCAGATG TGGTCCGCAG 
2591 ACTGCAGACA TITIGGCTTC AAATAAACAA ACATCACCTA AITTGACTAC ACCACACGGA CCTCCAATGT 
266X AA CAAAAA AA AGGITGAAAC AAAGTIGCCT ATTTCTCCAT AICCAGGGGC CATITATGTA AGAGTTATCT 
273X AAATTTC^GT TCGGTA GATC AGTTCTCACA TTTTAACCGG GTAAAGTGTA TG^ICTGTACG CXXXXIACXnG 
280X AAAGGTTTGA ANGTAACTTC CAAACTGAAN CAANAATCGA TATGAAGTAT CAAGTTRGAG GTTCAATTQG 
287X TGAA GGAAT C AGCTGGAGGT TGGGG AATCG AGCITCCACT ATTAAGGTAA AATCCATAAC CCTAAATCnT 
2941 GGTACGCTCA TATATCAAAT TGCGICTnT GTTGAATGAA AAAAGCATGC TCAAAAAACX: AGTGTAAGGC 
30X1 ACGG TATAT G ACATATTTAT AG nACTG AT AACAAATTAT GATAATTTIG GGTTTACXSTA AGTTAGGATT 
3081 COTACr:nCAA CCAAATGTAA TAGTTTTTGT GAGTCTATCT ATGTATTTGG GGAATCACAT TAGCAACGOG 
3151 A TTOTA CTAG TAATTCGAAA AAGTCmTA AATAATmT CrGTITATAA TTTATGAATA GTITTAGCGA 
3221 CATCTAATAT TAAATAGAAT GTATCTGATA TTGAATTAAT GTCCTTAATG TGAACATAGA CCTmCCAT 
329X TTACTAATGC CrAA TTATTA GTITCTAATC AATAAATTTT AATTTCTGTr TTATGCTTCT AAGACAATAA 
3361 AAATCCATGA TTTACCTTTA AATATTAACA AAAATGACCA TAAATAAATA AAAAATTAGG ATACCAAACC 
343X C CCCCGCC AT GCCCA ATGTC TAAAT ATTCT TGATGCTTTT GCTnTCCCT LTlTXlXri ' m TTAGTCTATT 
350X AncrmAGA GTTTCAGAGA GTTTCATACA AGAAAATTTC AAGAAGAAAG CAAAGGTCCA GGTATTCTCT 
357X TITCTIAATr ATCTAT TAAC TEACAAGCAT TTITIACAaS ATCCATGGTT 1TTTGTGTAT GTTTTTCAAA 
364X TTG AAACTAG ATTGGGACTr TTGCCCnGA TGATTCATAA GATATTGCAT GGAGTTGAGA TTGTGTAAGA 
37XX AAACTG3IGA ATAGAAAGAG CAAGTGAATC CAGATATAGT ATTGGTAATA TATGATGATG AGATAGAGAT 
37BX ATGTTAAAAC TGGCI AGAAA ATTGTTTTAA TTTGAAATTT AGGTIGTTGA ATTTGAAAGA TACCAAGCTA 
3851 A TAACTAA TT AGTTATGCTA AATAGTTATA AAGAACAACA AACTCGTAGT TnTTTTTCA TGATTTTCAA 
3921 CCrCTTCGTA CCAAACTAAA TTATAACAAA ATTGAATATC ATTCTCIGCA AlCAATTTTA ACTnTCTXA 
399X TIATCMCAT GTCTAAAATT GCCACAAGTT TATTITCATA GTCATATTGG ATTATGAAAG GACTATmT 
406X ACCAATTACA TCnTACnT ATGGCCAAAG CTAATACAAT CCGACTAAAC TAAAGGATTC TAGGATGCAT 
4131 ATAGTTTGCT CCCCGATTAT AGATTTCTAT CTAATTTGTC TATTGTACTA ATTTAGGTGC CACCACAAGT
4201 AAATTCCTGA AATGGATGTC GTTAATGCCA TTCTTAAACC AGTTCTCGAG ACTCTCATGG TACCCGTTAA 
4271 GAAACACATA GGGTACCTCA TTTCCTGCAG GCAATATATG AGGGAAATGG GTAT CAAA AT GAGGGG ATTG 
4341 AATGC7ACAA GACTIGGTGT CGAAGAGCAC GTGAACGGGA ACATAAGCAA CCAGCXTGAG GTTCCAGOOC 
4m AAGTCAGGGG TTGOTITCAA GAAGTAGGAA AGATCAATGC AAAAGIGGAA AATTICCCTA GCGAICTIOS 
4481 CAGTTCnTTC AATCTTAAGG TTAGACACGG GGTCG6AAAG A GAGOCTCCA AGATAATTGA GGRC ATOGAC 
4551 AGTCTCATSA GAGAACACTC TATCATCATT TGGAATGATC ATTCGATTCC TTTAGGAAGA ATIGATTCCA 
4621 CGAAAGCATC CACCTCAATA CCATCAACCG ATCATCATGA TGAGTTCCAG TCAAGAGAGC AAACTTICAC 
4691 AGAAGCACTA AAOGCACTCG ATCCTAACCA CAAATCCCAC ATGATAGCCT TATGGGGAAT GGGCGGAE3TG 
4761 GGGAAGACGA CAATGATGCA TCGGCTCAAA AAGGnTGTGA AAGAAAAGAA AATGTTTAAT TTrATAATTO 
4831 AGGCGGTTC3T AGGGGAAAAA ACAGACCCCA TTGCTATTCA ATCAGCTGTA GCAGATTACC TAGGTATAGA 
4901 GCTCAATGAA AAAACTAAAC CAGCAAGAAC TGAGA AGCT T CQGAAATGGT TTGTGGACAA TTCTG G UIX;!' 
4971 AAGAAGATCC TAGTCATACT CGACGATGTA TGGCAGTTTG TGGATCTGAA TGATATTGGT TTAAGTCCTT 
5041 TACCAAATCA AGGTGTCGAC TTCAAGGTGT TGTTGACATC ACGAGACAAA GATGTTTGCA CTGAGATOGG 
5111 AGCTGAAGTT AATTCAACTT TTAAT GTGAA AATGTTAATA GAAACAGAAG CACAAAGTTT ATTCCACCAA 
5181 TTTATAGAAA TTTCGGATGA TGTTGATCCT GAGCTCCATA ATATAGGAGT GAATATTGTA AGGAAGTGTG 
5251 GGGGXCTACC CATTGCCATA AAAACCATQG CGTGTACTCT TAGAGGAAAA AGCAAG GATG CATG6AAGAA 
5321 TGCACnCTT CGTTTAGAGC ACTATGACAT TGAAAATATT GTIAATGGAG TZTXTAAAAT GAGTTACGAC 
5391 AATCTCCAAG ATGAGGAGAC TAAATCCACC TmTGCTIT GTGGAATCTA TCCCGAARAC TTTGATATTC 
5461 TTACCGAGGA GTTGGTGAGG TATGGATC3GG GGTTGAAATT ATTTAAAAAA NTGTATACTA TAGGAGAAGC 
5531 AAGAACCAGG CTCAACACAT GCATTGAGCG GCTCATICAT ACAAATTICT TGATGGAAGT TGATGATCTT 
5601 AGGTGCATCA AGATGCATGA TCTrGTTCGT GCTTTTGTTT TGGATATCTA TTCTAAAGTC GAGCATCCTT 
5671 CCATTGTCAA CCATAGTAAT ACACTAGAGT GGCATGCAGA TAATATGCAC GACTCTTGTA AAAGACTTTC 
5741 ATTAACATGC AAGGGTATGT CTAAGTITCC TACAGACCTG AAGmCCAA ACCTCTCCAT TTTGAAACTT 
5811 ATGCATGAAG ATATATCATT GAGGTTTCCC AAAAACTITr ATGAAGAAAT GGAGAAGCTT GAGGTTATAT 
5881 CCTATGATAA AATGAAATAT CCATIGCTTC CCTCATCACC TCAATCTTCC GTCAACCTTC GCGTGTnCA 
5951 TCTACATAAA TGCTCGT TAG TGATGTITCA CTGCT CrXOT ATIGGAAATC TGTCGAATCT AGAAGTGCTT 
6021 AGCTTTGCTG ATTCTGCCAT TGACCGGTTG CCTTCCACAA TCGGAAAGTT GAAGAAGCTA AGGCTACIGG 
6091 ATTTGACGAA TTGTTATGGT GTTCGTATAG ATAATOGTGT CTXAAAAAAA TTGGICAAAC TGGAGGAGCT 
6161 CTAIATGACA GTGGTTGATC GAGGTCGAAA GGCGATTAGC CTCACAGATG ATAACTQCAA GGAGATQGCA 
6231 GAGCGTTCAA AAGATATTTA TGCATTAGAA CTTGAGTTCT TTGAAAACGA TGCICAACCA AAGAATATGT 
6301 CATTTGAGAA GCTACAACGA TTCCAGATCT CAGTGGGGCG CTATITATAT GGAGATTCCA TAAAGAGTAG 
6371 GCACTCGTAT GAAAACACAT TGAAGTTGGT TCTTGAAAAA GGTGAATTAT TGGAAGCTCG AATGAACGAG 
6441 TTGTrrAAGA AAACAGAGGT GTTATGTTTA AGTGTGGGAG ATATGAATGA TCTTGAAGAT ATTGAGGTTA 
6511 AGTCATCCTC ACAACTTCTT CAATCTTCTT CGTTCAACAA TTTAAGAGTC CTTGTCGTTT CAAAGTGTGC 
6581 AGAGTTGAAA CACTTCTTCA CACCTGGTGT TGCAAACACT TTAAAAAAGC TTGAGCATCT TGAAGTTTAC 
6651 AAATGTGATA ATATGGAAGA ACTCATACGT AGCAGGGGTA GTGAAGAAGA GACGATTACA TTCCCCAAGC 
6721 TGAAGTnrr ATCTTTGTGT GGGCTACCAA AGCTATCGGG TTTGTGCGAT AATGTCAAAA TAATTGAGCT 
6791 ACCACAACrC ATGGAGTIGG AACTIGACGA CATTCCAGGT TTCACAAGCA TATATCCCAT GAAAAAGTIT 
6861 GAA ACAT TTA GTnCTTGAA GG AAGAGG TA AATATAAA1T T TTAAT GCTA ATACAT TACA AAGGATCTTT 
6931 TCAGTTAAAT CTTTCAAAAT ATATIGTAAT TTGATTGTAT GGGC?rATTAT 'lliTlUGAlGG GACnVTTAAT 
7001 AAATGATTAT CTIGCAGGTT CTGATTCCTA AGTTAGAGAA ACIGCATGIT AGTXAGTATGT GGAATCTGAA 
7071 GGAGATATGG CCn UCGAAT TTAATATGAG TGAGG AAGTT AAGTTCAGAG AGATTA AAGT 6AGTAACTGT 
7141 GATAAGCTTG TGAATTTGIT TCCGCACAAG CCCATATCTC TGCIGCATCA TCTTGAAGAG CTTAAAGTCA 
7211 AGAATTGTGG TTCCATTGAA TCGTTATTCA ACATCCATTT GGATIGTGTr GGrTGCAACTG GAGATGAATA 
7281 CAACAACAGT GGTGTAAGAA TTATTAAAGT GATCAGTTGT GATAAGCTTG TGAATCTCTT TCCACACAAT 
7351 CCCATGTCTA TACTGCATCA TCTTGAAGAG CTTGAAGTCG AGAATTGTGG TTCCATTGAA TCGTTATTCA 
7421 ACATTGACTT GGATTGTGCT GGTGCAATTG GGCAAGAAGA CAACAGCATC AGCTTAAGAA ACATCAAAGT 
7491 GGAGAATTTA GGGAAGCTAA GANAGGTGTG GAGGATAAAA GGTGGAGATA ACTCTCGTCC CCTTGTTCAT 
7561 GGCTTTCAAT CTGTrGAAAG CATAAGGGTT ACNAAATGTO AGAAGTTTAG AAATGTATTC ACACCTACCA 
7631 CCACAAATTT TAATCTGGGG GCACTTTrGG AGATTTCAAT AGATGACTGC GGAGAAAACA GGGGAAATGA 
7701 CGAATCGGAA GAGAGT AGCC ATGAGCAAGA GCAGGTAAGG ATTrCAATTT CACTGTCTTA ATTAATGATP 
7771 AAOCTCCTGC TTTTrGAATA AAAAAGGGAC AAACCATTTC ATGACTTAAT GTAGCAATAC AAGTCATC3TA 
7841 TAAGAGTGAC CAACICmT TPATTTATAA AATGACTACA AAATATITTT TTTCATTAGA GATCATGTAT 
7911 AAATCjIGACT AATTTTTCAT CACCTAACTT TAGTTGATAA ATCTTTATAA ATGTCACTAG TTACTTnCA 
7981 GTAAAATAAC AAATFTAATA AATTATCAAC AAAAAGCATC AACTAAAAAA ATCCCACAAC ccgtaataat 
8051 TIAAAATAAA AGGATTTAAC ATCTAATACG AACAATTTTT TTTCTAAACA TGATITGGAC caaatatcac 
8121 CAGCAACTCA AGTITGGAAT CGATTCAGCT TAAAACTTGA CCAGCATAAT TAGATAGATG AGAGTTGAAG 
8191 CTAAAGTGCC TATATAAGTT CGTITCATCT TTTTrCTTGA TCTIGATAGC AAGTTGAATG ATnTCTTCT 
8261 TCAAAATTCA TAAAAATCTA CATTATAAAG AGACTAGCTT GAAAAAAAAT GGTCTAGGTG GGTCTTGGGT

8331 TCTGGTAGAT GAAGATGGAA GGGGAGAGTA TGATTTCAAA GACACAACAC ATCCTTCATT TTATTTATIT 

8401 ATTATTATTA TTATmTTG ATATCTTGCT CATATTTGrT ACAGATATGT GAGGTCTATT AATCmTTA 

8471 AATATATAAA AAAATAAATA ACATAAATGA GAAAATTAAA TAAAGAATAA ATTAATAAGG GCACAATAGT 

8541 cnrrrAGGT aagacaagga ccaaacacgc aacaaaaata aacagtaggg accatccgat ttaaaaaaaa 

8611 TAATTAGCXSA CC AAAAACAT AAA TICCCCC AAACCATAGG GWXATICRT GT AATTTAC r CTTA CTm Xj 
8681 GmTCTTCA TATTTGGGfTA ACTATTnTT TTGTACftCAT CTAGGTAACCj AACTTGTTCA AGTCTTCCCA
8751 TITAOGATGT GACCTACTAC AACCGATCAT AATAGTCATA TGTGAACACT TCCAACAACT TTATTACTrA
8821 GGTCrrGTACA AAAAAACAAT AGTTACCATG ATGTGAACAT ACTCSAAAAAT TAATTACCTT AGCAAGTTAT 
8891 TTTCCCAT1T AGGTTGTATG GA AACAG TTC CGTGAGACCG TGACTTGGAT GGTAGATAAA TTTAGTAAAC 
8961 TTAACCCnC AA TTAAC CTA CLTriTiUrr ATTAACTCAA TTTCAACCTA AATTCTGATT CTrGTTIGAA 
9031 AGTAAGTIGC ATCTITATTT TTGrATTATC ITCTTGCATA GGATCCTTAG CATCTTTTAA TAATTTATTr 
9101 GAAGGTGAAA GATCCAACTA TnTTAATCT GTrGGCATTT TCTATCATIT GCAACTGTTT CTTGAAAAAA
9171 AAATACCTAA AATCAAAATA ACCATTTTCA AATCCAAAAT TATAAGAGAG AATTGTAAAT GGACATGGAA 
9241 TCATAAATCA TTAACACAGT TCAGTAAACA AGTTGCTAAT TACATTTCTT GCTGTGCAGA TTGAAATTCT 
9311 ATCAGAGAAA GAGACATTAC AAGAAGCCAC TGACAGTATT TCTAATGTTG TATTCCCATC CTGTCTCATG 

9381 cACTcnrrc ataacctcca gaaacttata ttgaacagag ttaaaggagt ggaggtggtg tttgagatag
9451 agagtgagag tccaacaagt agagaattgg taacaactca ccataaccaa caacaaccta TTATAcrrcr 

9521 CAACCTCCAG GAATTGATTC TATGGAATAT GGACAACATG AGTCATGTGT GGAAGTGCAG CAACZGGAAT 
9591 AAATICTTCA CTCTTCCAAA ACAACAATCA GAATCCCCAT TCCACAACCT CACAACCATA AAAATTATGT 
9661 ATTGCAAAAG CATTAAGTAC TTGnTXCGC CTCTCATGGC AGAACTTCTT TCCAACCTAA AGCATATCAA 
9731 GATAAGAGAG TGrTGATGGTA TTGGA GAAGT TGTnCAAAC AGAGATGATG AG6ATGAAGA AATGACTACA 
9801 TTTACATCTA CCCACACAAC CACCACTTTG TTCCCTAGTC TTGATrCTCT CACTCTAAGT TTCCTGGAGA 
9871 ATCTGAAGTG T ATTGGTCG A GGTGGTGCCA AGGATGAAGG GAGCAATGAA ATATCTTTCA ATAATACCAC 
9941 TGCAACTACT GCTGTTCTTG ATCAATTTGA GGTATGCTTT GTACATATTC AATTATTTAT TTAATTTCCT 
10011 •nTTTATTTG CAATATTCTA TAAATAATAC ATTTTATACC CACTATACTA AGATAATAAT TACCTAGAGG 
10081 GATGGATGCr ATGACACAGC TGCTACACTT CAGAAACTCT AGTAAGGGCA GTTATGGAAG TTCAATAAAA 
10151 TGATAATGGC ATCTTTTGAT GGGTAATATA GGCAATTTAA GTTTTATTTC TGTTAAAGCA GTATTTAGCA
10221 AGTACTGGCC AGTAGGAGAG GAGAATATCA CCTTTTCTGA AAATCTG G TC ATTGTACCCA GAATTTAC?IT 
10291 AAATGTAACA TTTTAGATAT CAGGGGTCAT CAGGTGACAG ATATTGTAGA ATAGAACAAT ATATAATATC 
10361 ACCCAAAACT ATTTTTTCTA AGGTTATTCT GTTAAATATG TGCTTTCTTG ITTTCAINGA ATTMGCATTC 
10431 GTATATTTTA GGTGTTAAAG TGATTITNTC TTCAATAAAT CCCGAAATTA ATTAAAAAAA AAAAAACAAA 
10501 AGTACAnrr TGATGTGGAG AGCACTGGTA TCACTTAGTA TATAAAAAGC TTGATTTTGA ATTAACTTTC 
10571 TTATACAAAA GTTGTGTATA TAGTTTAATT AGTTTTACAT CATTTITCCA ' i l/mU'llal'lU CAGTTGTCTG 
10641 AAGCAGGTGG TGTTTCTrGG AGCTTATGCC AATACGCTAG AGAGATGAGA ATAGAATTCT GCAATGCATT 
10711 GTCAAiGTGTA ATTCCATGTT ATG CAGC AGG ACAAATGCAA AAGCTGAAGG AGAGGACAGC GATTCTOTrA 
10781 CGAACGGTTA CGATTCGACT GGCCGTCGTT TTACA 
MDWNAIlJ<PWETLMVPVKKHIGYUSCRQYMREMGIKMRGlJv|ATHLGVEEHVNRNISNQl.£VP

RGWFEEVGKINAKVENFPSDVGSCFNLKVRHGVGKRASKIIEDIDSVMREHSillWNDHSIPLGRDSTK 

ASTSIPSTOHHDEFQSREQTFTmj^ALDPNHKSHMIALWGMGGVGIOTMMHRLKKWKEKKM™^^ 

BVWGEKTDPIAIQSAVADYLGIEIJJEKTXPARTEKmKWFVDNSGGKKILVlLDDVWQFVDLNDIGLS 

PLPNQGVDFKVUTSRDKDVCTEMGAEVNS^H^VKMUE^B^QSLFHQ^E!SDD\^ 

CGGLPIAIKTTMCTU^GKSKDAWKNALmLEHYDievJIVNGVFKMSYDNLQDEEmff 

1LTEELVRYGWGLKIPKK?YT1GEARTRIJ^CIERUHTNLLMEVDDVRCIKM 

ASIVNHShm£WHADNMHDSCKRI^LTCKGMSKFPTDLKFPNI^IL^^ 

VISYDKMKYPLLPSSPQCSVNLRVmmKCSLVMroCSClGNLSNL£Vl^FADSAIDRIJ'STlGKLW<L^ 

LLI)LTNCYGVRIDNGVLKKLVKLEELYMT\MDRGRKA1SLTDDNCKEMAERSKDIYAL£^ 

NMSFEKLQRFQISVGRYLYGDSIKSRHSYEimKLVl£KGELLEARMNELR<KTE\^ 

VKSSSQlJ.QSSSFNNmVLWSKCAElJ<HFFTPGVANTU<KL£HLE\A'KCDNMEEU 

KlJ<FLSLCGLPKLSGLCDNVKIlEiJ=QIJVIElJEIJ3DIPGFTSIYPMKKFErFSLlJ<EEVLIPl^ 

WNIXEIVVPCEFNMSEEVKFREIKVSNCDKLVNUTHKPISUJHHLEEIJWK^^ 

GDEYNNSGVRIIKVISCDKLVNLFPHNPMSILHHLEELEVENCGSIESLFNIDLDCAGAIGQEDNSISLRNI 

KVENLGKm?\WRIKGGDNSRPLVHGFQSVESIRmC7KFRNVFTPTTTNFNLGALLEISIDDCGENR 

GNDESEESSHEQEQIEILSEKEn.QEATDSISNWFPSCLMHSFHNLQKLILNRVKGVEWFeESESPTS 

REL\nTHHNCXXJPIIIJ3NLQELILWNMDNMSH\AWKCSNWNKn=TlPKQQSESPFHNLTT1KlM 

LFSPUW\ELJ^NlJ<HIKIRECDGIGEWSNRDDEDEEMrrFrSTHTTrnJT'Sm 

GAKDEGSNEISFNNTTATTAVLDQFEVCFVHIQLR. 
5^02 ri^ NO .* 2 3

1 AGmrmT tttcccaata tccatttata tgcgatttat ttctgaaata attitatcaa aacgcaggaa 

71 ACAATGTAGA ATAATACTGG TATAATTAAT TATATAAAGT TATTAGGCTG AAATCTTGAG GCTACTATAA 
141 TTTAATTATC ATAATTTGAA AATCATCAAA TTGTATICCA TGTATATTTA TGTTATCAGA TAATTAA'I;AA 
211 TATGTGI^C ACACAAATCC ACATCATCAG ACACCCCACC TTATTGTCGG CTACCTCACC ACTTGCATGA 
281 TCCCGACATC TTCCCAACCC CACCGaCG&C TTCSG Gg lCTC CTTAATATAT CAATTATTTT CTGTAAGTAT 
351 TTATTTSICT AAATGTGTAA TOICATnTA CCTTmrCT AATATATACA GAAACATAAA TTTTAAATCA 
421 AATTCAACTG CGlTTCATrC TTGCATTAAA AAAAAACACT GTACTGTTGT CAATATTTTA CCTATAACCT 
491 GAITAATTAA TTAAAGOGTA ATTGCATAAT TTGCATTAGG TTGTAATnT GTGnTTATA GGGAGXjGiIGA 
561 GGCrrCACCGG GAATCAAAGC ACrTATGTAA AAGCAOGGGA AATACAAAAA ATTmrTCGA AACAAATTIT 
631 ATTCAATTTA AGTGAGATAA TAATC3TTCTG ATTAGATTAT GAGAACTAGG AGAITTAAGT GATATATCCC 
701 AITTAAAAGA AATTGCATTA TrAATTTTGG ATCTCTTGAT GATGACAAAA TTAACTCGTG ACAGGTTATA 
771 TATCATATAC AAAATGAGTG GCTATGCTTT CGCTITCCAA AAAGCAATTA TAGTTATACT ACACCTACAA 
841 ATTITAAAAG GGGTTAAACA TATCAAAATA CTTGATAAGT AATTATATAA ATATGCATTT AACCCTCTAA 
911 AGAAAATGCT ACTAAGCTTG GACCATCTCA GAATTACAAT CATACCCTTC CCCTCAAAAA AGATTCGTAT 
981 . ATATCATGTC ATTTGGCATT CATTTCTTTr TCACAATTCA TAGTrCTATT CTCAAAAAAT TCGAGTICTC 
1051 GTATTTSTAA GGAAGATCAG AAGAGACTGT TCACACAGGT ACTCTCmT ATTTATTGAT TCACATTCAT 
1121 ATATGTTATT GTITrClTGC TTAATCGTIT CGTCAGTCTA ACrGCGCITG CTtSATTTAAA TITCTTCACT 
1191 TTCnCCACG GATnTTTAA ATATTAGnT TGTGAATGAA CAATTGGTGA AGGAAAGAAA CATGGGAGTC 
1261 ' I ' mvrA AAG TAAA CCTAGA T ACTTAGG TT ATAAGG GTAT ATGCTAAAAT GAAC TATGCC CATlCACnT 
1331 TGCCmrCT TlTACTTTTr AGTITTTAGA ATCCAAGTTT TCATATOTAT CTCGATCTGT GAGAAGAATA 
1401 GGCATTAGAA AGGTAAAGGA CGTACATAAA A7TGATIAAT TAGTGAATGT TCTTTGATAT GATTATri'lT 
1471 ACTCTCATAA AAAGCATATA GATCAAACAC AAATIGCTAC TTCJITAGTGT AACAACTTCG ACTTAATAAT 
1541 GTTAATAATC AAGATTCTCT TGATTTCAAC TATTTTCTAA CCGAACAAGC TCACTAAAAA CTCATATTGC 
1611 TTTGAGrrCTG AGTGGTTTAT ATTTGGGGTT TTACATTTAA TTnTTGTGC ATSAATGTGA AAATAGACTG 
1681 CITATTGAIT CTriGTGnT CATTGAGTIG ATTTTCATTA TTACTACCTT ACAAATTGCT CAGTGATAGA 
1751 TTTCCATTAA TTTGCTAATT CGGTTGCTTC TAAATATGTA GGAGCTACTA AAAGCAAAAA TATCGAGCAA 
1821 TGTCGGACCC AACXXX3GA1T GCTGGTGCCA TTATTAACCC AATTGCTCAG ACXX3CCTTGG TTCCCGTrAC 
1891 GGACCATGTA GGCTACATGA TTTCCTGCAG AAAATATGTG AGGGTCATGC AGATGAAAAT GACAGAGTTG 
1961 AATACCTCAA GAATCAGTGT AGAGGAACAC ATTAGCCGGA ACACAAGAAA TCATCTTCAG TTCCATCTCA 
2031 AACTAAGGAA TGGTTGGACC AAGTAGAAGG GATCAGAGCA AATGTGGAAA ACTTICCGAT TQATGTCATC 
2101 ACTTGTTGTA GTCTCAGGAT CAGGCACAAG CTIGGACAGA AAGCNTTCAA GATAACTGAG CAGATTGAAA 
2171 GTCTAACGAG ACAACTCTCC CTGATCAGTT GGACTGATGA TCCAGTTCYT CTAGGAAGAG TTOGTTCCAT 
2241 GAATGCATCC ACCTCTQCAT CATTAAGTGA TGATTTCCCA TCAAGAGAGA AAACmTAC ACAAGCACTA 
2311 ATAGCACTCG AACCCAACCA AAAATTCCAC ATGGTAGCCT TGTGTGGGAT GGGTGGAGTG GGGAAGACTA 
2381 GAATGATGCA AAGGCTGAAG AAGGCTGITTG AAGAAAAGAA ATIGnTAAT TATATTGTTG GGGCAGTTAT 
2451 AKGGGAAAAG ACGGACCCCT TTGCCATTCA AGAAGCTATA GCAGATTACC TCGGTATACA ACTCAATGAA 
2521 AAAACTAAGC CAGCAAGAGC TGATAAGCTT CGTGAATGGT TCAAAAAGAA TTCAGATGGA GGTAAGACTA 
2591 AGTTCCTCAT AGTACTTGAC GATGTTTGGC AATTAGOTGA TCTTGAAGAT ATTGGGTTAA GTCXnTTTDC 
2661 AAATCAAGGT GTCGACTTCA AGGTCTTGTT GACATCACGA GACTCACAAG TTTGCACTAT GATGGGGGTT 
2731 GAAGCTAATT CAATTATTAA CGTGGGCX:TT CTAACTGAAG CAGAAGCTCA AAGICTGTIC CAACAATITG 
2801 TAGAAACTTC TGAGCCCGAG CTCCAGAAGA TAGGAGAGGA TATOGTAAGG AAGTGTTGCG GTCTACCTAT 
2871 TGCCArTAAAA ACCATGGCAT GTX\CTCTTAG AAAT7UUAGA AAGGATQCAT GGAAGGATGC ACTTTCGCGC
2941 ATAGAGCACT ATGACATICA CAATGTTSCG CCCAAAGTCT TTGAAACXa^ CTACX3^CAAT CTCCAAGAAG 
3011 AGGAGACTAA ATCCACTTIT 1TAATGTCTG GTTTGTITCX: CGAAGACTTC GATATTCCTA CTGAGGAGTT 
3081 GATGAGGTAT GGATGGGGCT TGAAGCTATT TGATAGAGTT TATACGATTA GAGAAGCAAG AACCAGGCTC 
3151 AACACCTGCA TTGAGCGACT GGTGCAGACA AAnTGTTAA TTGAAAGTGA TGATOTTOGG TGTGTCAAGA 
3221 TGCATGATCT GGTCCGTGCT TTTGTTTTGG GTATGnTTC TGAAGTCGAG CATGCTTCTA TTGTCAACCA 
3291 TGGTAATATG CCTGGGTGGC CTGATGAAAA TGATATGATC GTGCACTCTT GCAAAAGAAT TTCATTAACA 
3361 TGCAAGGGTA TGATTGAGAT TCCAGTAGAC CTCAAGTTTC CTAAACTAAC GATTTTGAAA CTTATGCATG 
3431 GAGATAAGTC GCTAAGGTIT CCTCAAGACT TTTATGAAGG AATGGAAAAG CICCATGTTA TATCATACGA 
3501 TAAAATGAAG TACXTCATTGC TrCCTTTGGC ACCTCGATGC TCTACCAACA TTCGGGTGCT TCATCTCACT 
3571 GAATGTTCAT TAAAGATGTT TGATTGCTCT TCTATCGGAA ATCTATCGAA TCTGGAAGTG CTGAGCTTTG 
3641 CAAATTCTCA CATTGAATGG TTACCTTCCA CAGTCAGAAA TTTAAAGAAG CTAAGGTTAC TTGATCTGAG 
3711 ATnTCTGAT GGTCTCCGTA TAGAACAGGG TGTCTTGAAA AGmTGTCA AACITGAAGA ATnTATAlT 
3781 GGAGA'TSCAT CIGGGrTTTAT AGATGATAAC TGCAATGAGA TGGCAGAQCG TTCTTACAAC CnTCTGCAT 
3851 TAGAATICGC GTmTTAAT AACAAGGCTG AAGTGAAAAA TATGTCATTT GAGAATCTTG AACGATTCAA 
3921 GATCTCAGTG GGATGCTCTT TTGATGAAAA TATCAATATG AGTAGCCACT CATACGAAAA CATGTIGCAA 
3991 TTGGTGAOCA ACAAAGGTGA TGTAT TAGA C TCTAAACTTA ATGGGTTATT TTTGAAAACA GAGGTGCnT 
4061 TTTTAASTGr GCATGGCATG AATGATCTTG AAGATGTTGA GGTGAAGTCG ACACATCCTA CTCAGTCCIC 
4131 TTCATTCTGC AATTTAAAAG TTCTTATTAT TTCAAAGTGT GTAGAGTTGA GATACCTTTT CAAACTCAAT
4201 CTTGCAAACA CnTGTCAAG ACTTGAGCAT CTAGAAGTTT GTGAATGTGA GAATATGGAA GAACTCATAC 
4271 ATACTGGAAT TGGGGGTTGT GGAGAAGAGA CAATTACnT CCCTAAGCTG AAGTmTAT CTTTGAGTCA 
4341 ACTACCGAAG TTATCA AGTT TGTGCCATAA TGTCAACATA ATTGGGCTAC CACATCTCGT AGACTTGATA 
4411 CTTAAGGGCA TTCCAGGTTT CACAGTCATT TATCCGCAGA ACAAGTTGCG AACATCTAGT TTGTTGAAGS
4481 AAGGGG TAGA T ATATGT ICT TTATCITAAT ACAATrTRAA TAATATTTTC AAOCAAATTT TCATAATATA 
4551 AV-i ulAA TTr GArr GTATGA TCTGTTATTG TTIATATGTG GCTATTAAGG GATGATTATT TTCCAGGTTC 
4621 TGATTCCTAA GTTGGAGACA CTTCAAATIG ATGACATGGA GAACTTAGAA GAAATATCGC CTIGTCAACT
4691 TAGTOGAGGT GAGAAAGTTA AGTTGAGAGC GATTAAAGTG AGTAGCTGTG ATAAGCTTGT GAATCTATTT 
4761 CCGCGCAATC CCATGTCTCT GTTCCATCAT CTTGAAGAGC TTACAGTOSA GAATT G O G GT TCCATTBAGT 
4831 CGTTATrCAA CATTGACTTG GATTGTGTCG GTGCAATTGG AGAAGAAGAC AACAAGAGCC TCTTAAGAAG 
4901 CATCAACGTG GAGAATTTAG GGAAGCTAAG AGAGGTGTGG AGGATAAAAG GTGCAGATAA CTCTGATCTC 
4971 ATCAACGGTT TTCAAGCTGT TGAAAGCATA AAGATTGAAA AATGTAAGAG GTTTAGAAAT ATATTCACAC 
5041 CTATCACCGC CAATTTTTAT CTGGAGGCAC TnTGGAGAT TCAGATAGAA GGTTGCGGAG GAAATCACGA 
5111 ATCAGAAGAG CAGGTAACGC TTTCAATTTC ACTITCTTAA TTAATTAAGG ACTAAGCTCC TGTTTnTGA 
5181 ATAATAAAGA GGTGGGATGA CTAAACTTGG GCATCACAAT TSCAACAAAA TGTTACAAAC CATGAAACX3T 
5251 TCAAACCATT ivxl\ iAATTA AGGTITCAAT ACAAGTCATT TAAAAATATG GCTTAAATTT TTTTTATATr 
5321 TATG TATCA A CATGA 'XTlTr CATTAGAGAT CATTATTATA ATAGTAAGTT TAAAGCAATT TAAATCAGAA 
5391 CTAA i'ivA'AA CTTTAGCTAA TAAATCGTTA TAAATGTTAAA TAATTACTTT TTACSTGAAAT AAGCAACGGA 
5461 TTTAATAAGT TAACAACTTA AATGTCATTT CCTAACAAAA AAAACTTTGG TTCAGAAAAA CCGCAATTCA 
5531 AGATAACTAA AATAAAAATA TTT GACATTC ACTAAGAGCA TTrmTlTC TAAATATGAT TGCAAATGAA 
5601 TAAAACTTAA ATTTATACAG AAAATTCTTT TATATATGTT ATACAAAATT TACAAATTGA AATTGGATAT 
5671 GTTAATTAAC GG TITATA AT TCTGGTATCA CAAAGGGATA TATAATAAAA TATTATnTC TGTAGTCATT 
5741 TGTAATTGTA CTAGTrTATA ACCCGTGGGA ACCATGAGTT CTAAAATTAG TTAAACnTC ATAATAAAAA 
5811 TTTA TAATTA TTATTTATTT TAAATAAATT ATTAATTAAG AGATATATCA AAAATTTAAA GTTATTATAA 
5881 CTTCAAATTT AAC ATATAAT TAGAAAATAT ATGATCATAA CTTCTGCACT CTCnTGTAT AAATGCAGAG 
5951 AAGCTATTAG TATATTTCTA ATCAAGTCCA AACCTAATGA AGCCTATATA ATnTGTGAA AACTCAATTA 
6021 GCATTAGG TT TTAAGAGTCA CCAAATTCAA AGAATAATCC AATGCTTTCA TTACGACTAT GGAGAAAATA 
6091 'm^CTTAGT T TAAAT GAAA TGAAAACAAA CATTCAAACT AATrCTTGCT TATTAAACCA AAGACCCATT 
6161 ACTTAGCCAA GAGTTT AACA AAAAAAAATT ACATTCATOT ATCATTATTC ATGACTAGAT ATATATGAAC 
6231 ATGAAGGGAG TITXTATAGA AAATATAATC ATAGATATTC AACATAACTT CAGGGAATTC CTCAAAATAA 
6301 CCAAGTTATT CAAGAAATTA CATCCAAGTC AACCAAAGAG AAOTITAGCC TAGCATGGCT AAACTCAAGA 
6371 A ACTAAAAT A A GGATTAG AA GTACCAAACA TGTAGTAAGA ATCACACTAA AAGATGATGT TGnCTTGAT 
6441 GTrcrrCTAA GTTCTTCAAG TCTCCAGTTG CTCCTAATAA TGCAAAGGAG AGCCATTAAA TTCGTATGTA 
6511 TTGATCCCTT CAAAAGCTGC ACCAACCTCC CTTAAATAAC ACICAAAGCA AAAATGACAA AATGCCCTGA 
6581 AGGACC CTAT GTGGGTGCCT TGCGCGGGTG GAGCTGCATA CGAAAGGTCT 'nULT l V m i; TGAGGGTGAT 
6651 C'1 TCTX:;CGGG ATAGCTTCTC GCATGCTTCC GCGGGGTTCA CGCACATGTG CACAGGTGAT GCATGGTGTC 
6"?21 TG Uy^-iVi-i'G AGmTGAGC CTCCGATGCT TAGTCCACTT GGCCCAATTC GAGTCCAATC AGCTTATAAC 
6791 CCATTTTTCT TCAAGTTATC TTCAAGTTAA GCCCAATTTG GCITCTCCAA ATCATCCATA ACTTCACAGA 
6861 ATCGC CCGTT CATCTTAATC CCGGATGCAC AATTATTCTC CCGTCTTCAT TTTAAGCAAG ATACCACCTT 
6931 CTTCATGCTT CATCCATCAA TAGTACACTT CATGTATCAT CTCTACTAGT TATTTAGTCC ACAAATCCIT 
7001 GTTGTCCTCC AAATTTAATT ATCTCATTTA GTTCCCCGIT CCGCTACTIT CCITAAAATT TGGAATTAAG 
7071 CTCAGAGAAA TATTAACyTAC CCGAAATQGT CATAAAATTA ACAAAAAGGA AAATGCATGA AGATTAACTA 
7141 AA TGATGA AC G AAATAT GCT AAAATAGACT ATAAAATGAA GTAAATAAAA TGAAATTATC GCACTCCQAC 
7211 CACCCTTATG GCITGTAGTC CAC CCACCC T TCATICCTTG TACCAATATC GGATCGAAAC ATCATTAATT 
™^ ^:^EEE^i^!^^ GCTAACATAT AAGGGTTIftG TGACAAAGGT AAGTACTAAA GATGAAAATA ATCCA TATIT 
"^351 CTTCTmTA CACAA CACAC ACATAGGGGC AGACC3TAGGA TTTCAAAGTA CAGATTGTrG GTOGCACATA 
7421 AGTOnCCIG G TGACATTTT TTTTTrCTTT TTACGTGGTG GCACAACAGT AGGAAAAACG AAAAATTCGA 
7491 AATmTTAC AATITGTCTT AAAAAAAACA GGGGITCTTG GTGCCACTAT GGACAACAAA GTIGAACTGC 
7561 CCTACGCGCG CACACA CACA CACACACATA GAGAGAGAGA GAGAGAGAGA GAGAGAGAGA AAGAAAGAAA 
7631 GAGAGAGAGA GTITGGGATG TGATACTTCT TTTAGGAAAA TGGAGTTATA TCTTTGATAT TGT A ' I ' I ' mT 
7701 TAATGTAATT TAT OTAT TTA ATCATTTTAG TTTATAAGTT NTATTTATIN GGOTATGAAA AAAAAAGTCT 
7771 TTTATACATT GGATTTAACA TAAAAATCCA ACAATATTAA TCAAAAAGAC CAAACATGTG GACAATTATX3 
7841 TATATAATTA ATTCA CAATA GTCTITAGGA ATAGTATTAT ATATATAATT AATTCTCAAT GGTCTTAGGA 
7911 ATAGrrAAGTT CTTATATITC AAACTTITGC CACAATTCTT TGCTTACTTT GACACmTC CTTCCTAACT 
7981 TTACATATAT ATATAT ATTA AAGC GCAAAG GTCATAGGAA TATAATATTT TCTATTATTC TACGITrroC 
8051 CACAAAACTT TGA ACACIT T GCCACTmT CyTCCCrcCTT AACCmTCA ATGTmGCG ACAAAAGTIC 
8121 CAAAACTTTG CCACITTGAT CATICCTCAA CTmCACCG CATTAGTITG TGGAGTIGGC AGTTTTOGIC 

8191 ccrcrAAcrr cgatattctc tactgctagc caaaaagggt tccagagxtt avcAcrrrrG gtccctgaca 
8261 GTAACX:AAAT GTGAGATGTC AAATTTTTGC CAGATTAGTT TGTOGAGTTG TCCCmTGG TCCXXrCACA
8331 rrCGATATTC TOTATTTTT CTCAAATAAC AACACC rftTA TTTCATCzCT AATTQGAAAA

8401 AGAGTITTAA AA:AAATAAC GACTAGG: : : G:GC:GAGTr TrTTTTiACA AGTTTGTATC AAATCATJVTC 
8471 AAAATTTAAG GTGGAACGGT GACCACATTA ACCAGAAATG TAATTTATrC TTTGATITrG ATAATmTA 
8541 ATATTTTGTr GTGATCTATG TATTTAAAAG TAAACAACAA AGAACATAAT CCAAAACCCT AAATTGCAAG 
8611 TCTCGCCCAA TTTCTCTATC. ACTAGTCCTC ACTTACGATiS GCGTTACGTC GCTCTCTCAC TGCTTACAAC 
8681 CCTTTGTTGC TACTCATTAC AATAACGAAA AGTTGAATAT CCATATATIT ATITGGATGT GGAATTGAAC 
8751 GAATCTCGTC AAAATTTTSA TnTGTTGAT GGATTTGAGT AGAAGITIGG GCAGAACGGG AATGATGGTC 
8821 TGCAAGTGGT TATAAACTTG ATrCTGAGTT ATTACTATAT ATGTAGCCTC TTTACAACGA CCAAGGnTC 
8891 TTCCAGGTAC CATITGATCT TnTAGAACT TAGTnTCTG AAACACCCTG ATTTGGATCA AATATCACCA 
8961 ACAACrCTTA AAAACTTGAT TAATCAATTG TTTTCTTCAT CTTGATAACA AGTGGAATGA TnTCTACTT 
9031 AGATTAACrr GAAAAAAAAG GTCCATGTGC GTCTG GTGG A TCTGG TAAAT G AAGATGG AA GG GAGAGCTG 
9101 ACTTTAAAGA CACAAACACG TCACCATATC TCTTATTOA CTTTA AATIT GCTTITGGTG TATmCTIT 
9171 TTTCCrATTT CTTrCTTlCT TGATCTCCAG ATGGTATGTG GTGTGGATAA TTTACACCTA GAGATTGGGA 
9241 ACX5ATGGGAA GGGGTCTGTX3 ATTTATGGCT GGCCGAGTTT TACTTATTAA CTCAATTTCA ACCTAAATTC 
9311 TGATTCTTGT TTGAAAATAA GTTGCATCTr TATTTTICTA TTATCITGTT GC ATAGG ATC CT TAGC ATCT 
9381 TrrAATTU^TT TATTTGAAGG TGAAAGATCC AACTATmT TAGCTGTTGG CATTTTCCAT CATTTGCAAC 
9451 TGTTICnGA AAAAAAAATA CCTAAAATAA AAATAACCAT TTTCAAATCX: AAAATTATAA GAGAGAATTG 
9521 TAAATGGACA TGGAATCATA AATCATTAAC ACAGTTCAGT AAACAAGTTG CTAATTACAT TTCTTGCTGT 
9591 GCAGATTGAA ATTCTATCAG AGAAAGAGAC ATTACAAGAA GCCACTGGCA GTATTTCAAA TCTTC3TA1TC 
9661 CCATCCICTC TCATGCACTC TTTTCATAAC CTCCGTGTGC TTACATTGGA TAATTATGAA GGAGTGGAGG 
9731 TGGTATTTGA GATAGAGAGT GAGAGTCCAA CATGTAGAGA ATTGGTAACA ACTCGCAATA ACCAACAACA 
9801 GCCTATTATA CTTCCCTACC TCCAGGATTT GTATCTAAGG AATATGGACA ACACGAGTCA TGTCTGGAAG 
9871 TGCAGCAACT GGAATAAATT CTTCACTCTT CCAA AACAAC AATCAGAATC CCCATICCAC AACCTCRCA A 
9941 CCATAAATAT TCTTAAATGC AAAAGCATTA AGTACTTGTT TTOGCCTCTC ATGGCAGAAC TTCnTOCAA 
10011 CCTAAAGGAT ATCXIGGATAA GTGRGTGTGA TGGTATTAAA GAAGTTGnT CAAACAGAGA TGATGAGGAT
10081 GAAGAAATGA CTACATTTAC ATCTACCCAC ACAACCACCA CTTIGTXCCC TAGTCTTGAT TCTCTCACTC 
10151 TAAGTTTCCT GGAGAATCTG AAGTGTATTG GTGGAAGTGG TGCCAAGGAT G AGGGGAG CA ATGAAATATC 
10221 TTTCAATAAT ACCACTGCAA CTACTGCTGT TCTTGATCAA TTIGAAGTAT GCITTGTACA TATTCCATTA 
10291 TTTATTTAAT TTCCTTmT ATTTGCAATA TTCTATAAAT AATACATTTT ATACCCACTA TACTAAGATA 
10361 ATAATTACCT AGAGGGATGG ATGCTATGAC ACAGCTGCTA CACTTCAGAA ACTCTARTAA GGGCAGTTAT 
10431 GGAAGTTCAA TAAAATGATA ATGGCATCTT TTGATGGGTA ATATAGGCAA TTTAAGmT ATTTCTGTTA 
10501 AAGCAOrrATT TAGCAAGTAC TGGCCAGTAG GAGAGGAGAA TATCACCTIT TGTGAAAATC TGGTCATTGT 
10571 ACCCAGAATT TAGTTAAATG TAACATTTTA GATATTAGGG GTTATCAGGT GACAGATATT GTAGAATA6A 
10641 ACAATATGTA ATATTACCCA AAAC TATITT TTCTAAGGTT GCICIGTTAA ATATGTGCTT TC TTGATITC 
10711 A1TGAATITG CATTCCTATA TTITAGGTGG TAAAGIXSATT GTCTCTTCAA TAAATCCCGA AATTnTTAR 
10781 TT AAAAAAA A AAAAAACAAA AGTAAATTTT TGATATG GAG A GCACT GGTA TCATIT AGTA TATAAAAA AC 
10851 AGATITTGAA TTAAGTrTCT TATATAAAAG CTGTGTATAT AGrXTAATTA GTnTACATC ATTTTTCCAT 
10921 GTGGTGTTGC AGTTGTICT G A AGCAGGTGGT GTTTCTTGGA GCTTATGCCA ATACGCTAGA GAGATAAAAA 
10991 TAGGCAACTG CCATGCATTB TCAAGTGTGA TTCCATGTTA TGCAGCAGTA CAAATGCAGA AAGCTT 
MSDPTGIAGAIINPlAQTALVP\nT)HVQYMISCRKWRVMQMI<MTElJ^SRlSVEEHlSRrfmNHLQlP

SQTKEVVmQ\^GIRANVENFPIDVITCCSWlRHKLGQKAFKITEQIESLTRQLSUSWTDDPV7LGRVG 

SMNASTSASl^DDH=SREKrFTQALIA!^PNQKFHMVALCGMGGVGKTRMMQRLJ<KA?Ee<K^^ 

GAVITEKTOPFAIQEAlADYLGIQmEKmPARADKLflEWFKKNSDGGKTKFUVmDNWQLVDLEDI^ 

SPFPNQGVDFKVLLTSRDSQVCTMMGVEANSIIIWGaTEAEAQSLFCK3F\^ 

CGLPIAIKTmC?IJ^NKRKDAVVKDAI^RIEHYDIHNVAPK\FErSYHNn.QEEEmSTFlJ^ 

PTEElJVIRYGWGLKUT)RVYT!RB\RTRLhrTCIERLVQTNmESDDVGCVKMHDLVRA^ 

ASIVNHGNMPGWPDENDMI\^SCKRISLTCKGMiaPVDU<FPKLTILKLMHGDKSmFP^^ 

HVISYDKMKYPIJPWPRCSTNIRVmLTECSlJ<MFDCSSIGNLSNLEVI^FANSHIEWLPS7VR^ 

RLlJDmFCDGIJ^IEQGVIJ<SF\/KL£B=YlGDASGRDDNCNEhMERSYNt.SAl^^ 

NLERFKISVGCSFDENINMSSHSYENMLjQLVTTJKGDVmSKLNGIJ^lOE^ 

T>^PTQSSSFCNUCVUISKCVEmYLJ^<lJ4LA^m^RLH^LEVCECENMEEUHTGIGGCGEE^rr^ 

LSL5QLPKLSSLCHNVNIlGIJ>HLVDUlJ<GIPGFIVlYPQNKLi^TSSLiXEGWIPKLErLQIDDMENLEE 

IWPCEL5GGEKVKlJVMKVSSCDKL\«^LFPRNPMSLLHHLEe.TVENa5SIES!^ 

KSLiBSINVENLGKLflEVWRIKGADNSDUNGFQAVESIKIEKCKRFRNIFTPrrANPrt^LLEIQIEGCQ 
GNHESEEQVTLSISL5 
BLG3 (r^ HU23)
[Strand] 

1 AATGGCAAAA GAAGTCGGAG CAAGAGCTAA GTTAGAGCAT CTATTTGACG TCATTATCAT GGTAGATGTC 
71 ACrCAAGCAC CCAACAAGAA CACAATTCAA AGrrAGTATTT GAGAACAGTT GGGATTAAAA CTGCAAGAAG 
141 AGAGCTTGTT GGTAAGAGCA GCTAGGGTAA GTGCGAC5GTT AAAAATGCTT ACAAGGGTGC TGGTGATATT 
211 AGACGATATA TCGTCAAGGC TTGACATGGA GGAACTTGGCS ATTCCCTTTG GATCAGATAG ACAACACCAC 
281 GGCTOCAAAA TCTTGTTGAC TTCAAGAAGT ATTAGTSCTT GTAACCAGAT GASAGCTGAT AGAATCTTTA 
351 AAATACGAGA AATGCCACTG AATGAAGCAT GGCTTCTTTT CGAAAGAACA GCTAAAAAAG CTCCGAATCT 
421 GCATCAAGTA GCAAGAGATA TCGTGGAGGA GTGTQGTGGG C 
1 G AATTCGCIC TTGGTAAGAC AACTCTTCCC TCTTCTGnT ATGATGAAAT CTCTAGCAAG TTTCATCCnT

71 GCTOTTTCT AAAAA TATCT GGGAGGWVTC AAC3TAATAAA GACaSTATAG AAAGATTGCA AGAAAAAATC 

14X ATTTGTCATG TTTTGAAACA AGAGCAAGTG GGCGTAGOCSA GAGriTGAAGA AGGAAAGCGC ATCATAAAQG 

211 ATAGG TTACA ACATAGAAAG GTATTOATTG TCCITGATCA TGTCGACAAC GTIGAGCAGC TAGCIAGAAC 

281 AGTTGGCTGG ATCACATGAT TGCnTrGGTC AAGGTAGCOG CATAATAATC ACAACrAGftG ATGAACATCT 

351 ATTAAT TGCA CACAAAGTAG ATGTGATACA CAATATAAGC TTOmAACA ACGATGAAGC TATGCATCIC 

421 TTCTDCAAGC AAGCACXyiCG GGGTCACAAA CGTATACAAG ATTATGAGCA ACTTTEAAAA CAroiGGlTr 

491 CTIATGCTGG TGGGCTTCCA CTAGCACTGT CGAC 
RtJGa.-EL69



[Strand]



I ATCGTAACC3 T TCGTjxCGA G ANCGCTCTCC CTCCTTCMC TTTTGrCATA TGTCATAITC TCMUNATH^ 
71 TCCCACATirr AATnrciGG TTATTITAAA TTAATITnA TTCCACATGT CATTTTATGA GTmTCTAT 
141 TnariGAOT TICRCATAAT KSTMKVS? AATAACAMA AATCCATMT TATmrCTT TAAATAAACG 
211 CATATAATAT ATAGAT TAAA ATCATATAAT ACATAQtSTTA AACICATATA ATACATATGT TCATCCCCAG 
281 irTATTTATA TGTCTCATCC TTAATITATT TATTATTTAT TPATTAGAGT AGATGATCTT TCTCATATTA 
351 AAAATTTAAT TTGTTCAAAA TTTAA AATTA TTAATAATCC CACAATTTGA ATAAAATTAA AAAAAATCGN 
421 CCC ACCATTA GTCCATCACT TTTTCAGCTC ATCAATATCG TCAGTATTCT CCTTCGTrrC CACCCTAATC 
491 AATATTTCCA GCGAATCSACA CSACTCCTACG GCGTTTCTBA ATTTGCCnTC CGACACTGTT CATICAAGGA 
561 GATAATAAAT CAAATGG5AGC TOCTCCAATC TTCATTGCTO ATCAAAGGTG AATTOTATCT GAAGANAATC 
631 TCAGCGATCrJ ATCTCCATCC GGAACCCACC ACATTATCAG TGTACCACCA AACCACICAA AACGGYGGAA 
''^^ WRKAAAGTCA TGAAGAATAG ATTATnTTG TCCTCATCGG CTCACTCAGG AGCGGOTTA 
771 GTTCATCATr ^TTCTITGAN CAAAGA ATTA TCGCTICCATC GAATnTTAC ATCGACAAAG AAGTITCACT 

'^^"^55^^'^'°^ TIurrAAACA A'm 'lTAATC ' I ' mTAlVlT TTCGTIGAAA CTCCXCAATT GCAACTIGCA 
911 ACTTOCAACT TTTGGGCCCA CAAATnGTS C5T0GGCGTIA ATTTAATCCA CATATICACT CTAAACAATA 
981 ATTCAAATCG ATCTCTGTTC ATCCAATTCA TCAACATCTC TTGATAATTG AAATCATICA CGCTTCATCC 
CATCTAI^ ATATTCTCTG CTCTTATCAT ATTAAACGAT QGCTCAAATC Lii - il. - n ' Xi:iV; 
1121 CCTICTTGAC ACrrGGTOTTT GAAAAGCTGG CATYTGAAGC CXTCSAAGAAG ATIGTICGCT CCAAAAGAAT 
1191 TGAATCrrS^ CTTAAGAAAT TGAAGGAGAC ATTAGACCAA ATCCAAGATC TOCTIAACGA TCCTICCCAC 
1361 AAGGAAGTAA CTAATGAAGC CCSTTAAAAGA TCGCTGAATO ATCICCAACA mGGCTTAT GACATAGACG 
1331 AC CTACT TGA TGATVTTGCA ACTSAAGCTG TTCAWCCTIGA CTTCSACCGAG GAGGCTIGGAG CCTCCICCAG 
X401 TATGGTAAGA AAACTAATCC CAAGTTOTrG CACAAGTTTC TCACAAAGTA ATAGGATCCA TCCCAAGITA 
1471 GATGATATTG CCACCAGGTT ACAAGAACTG GTAGAGGCAA AAAATAATCT TOnTTAACTr GTCATAACAT 
1541 ATGAAAAGCC AAAAATTGAA AGGTATGAGG L W i V.- mU J r AGATGAAAGC GGTACIGTCG GACCXGAAGA 
1611 TGATAAGAAA AAATTGCTGG AGAAGCTGTT GGGGGATAAA GATGAATCAG GGAGTCAAAA CTICAGCATC 
'HEEL^'^^ TGGAGTTQCTr AAAACAACTC TAJ3CTAGACT TTTOTATGAT GAAAAGAAAG 
Z^. TSiJSIE^ CnCGAACTC AOGGCrnSOC ■ ATiWiVmtJ TGATGAGTTC AGTGTICCCA ATATAAGCAG 

^fll^™^"^ CAATCTCTGA CTGGGGAAAA GAAGGACTXT GAASACTIAA ATCTGCITCA AGAAGCICT 
JS« ^^S^^^ TTAGGAACCA GCTATTT CTA ATAGTITnSG ATGATGTGTC GTCTCAAAGC TATCGTCATT 

S^JSJi^ AGTGGGCCCA TrCCTTGCGG GGTCTOTGG AAGTAGAATA ATCATGACAA CTCGG^aSa 
AGAAAGCTGG GCmTCTCA TCAAGACCCT CTGGAGGfflC TATCACAAGA TCATCCTITC 
V'^n^ CTCAACACGC ATnGGlGTA CCAAACTTIG ATTCACATCC AACACTAAGG CCACA3GCAG 

2171 AAGTGTITGT GAAGAAATCT GATGCCTTAC CTCTAQCYTT AAGAACACTT GGAAGC7ITAT rMJGGhCJMi 
11*^ iJ£^5E^ GAACAATGGA AGGAGCTCTTT GGATACTGAG ATATGGAGGT TAGGAAAGAG CGATCAGATT 
.^.^^" "^"^ TTAGACTAAG CTACAATGAT CTITCIGCCW CITTGAAGCT RTTRTTTOCA TArTCClCCT 
23B1 TUmCCC^J, GGACTATGAG TTIGACAAGG AGGAGTTGAT TCTATTGTGa ATGGCAGAAG GGnTTlGCA 

^M.?^^ ^^yy^ CAAAGCAACG KriGC^ll.T r GRATATTTrR AACaVGTTRTT GTCAAGR1TO 

ill ATGC TCCTAA TRRCAAATCS TTOnTClGA TGCATGACCT AATGAATCAT TIGGCTACAT 

II J: ItXS^X^ AGAATITITr TCAAOGTTAG ACATAGAGAT GAAGAAGGAA TTTAGGATSS AATCTTTOGA 
ItV: ^^a^^f^ CATATCTCAT TTGTATGTGA GRATTACATA OCTPACAAAA RGTICGAGCC ATTTAGAOGA 
2731 GCTAAAAATT TCAGAACATT TTTAGCATTG TCTCTTCGGG TOGTAGAAGA TTOGAAGATC TriTACITAT 

IIV: ^fSJS?: ^T^*^^ ^^^^^^^^ 

29« SI^S^^ '^^ ^ TATCAASCAC TTGOQCnATC TTAATCTATC WGRAACTIWA 

VJt\ IKTCTOCAAT CTTTATWVTr TACARACCCT GATOTCTCT GGCTCTCAMT 

SII^^ KTIGCCCAAR ACCTICTCAA ASCTTAAAAA TTTOCASCAT TITGACATCA GGGirrACICC 
3081 KAAJCTTRAAR AACATCCCCT TAROGATTOG TGARTTGAAA ARTCTACAAA CTCrCTTYMG TAACATIGCX! 
CCGAGCTTAA GAACTIGCAM AAYCTCCMG GGAaSStc 

^g^^ ICTXGATGC ACGTTAA GCG AACTTCrrCTC A:AAAAAGGT TWAaSt ScTOGR 
III. ^SS:r:Lr ATOAATTTAA TGimCCCA AATGGGAACA CTIGAAAAAA NAAGGTCCTC AATGAATIGA 

^?°^^r^ ATGOTAVrcy AAMWAARRRY YYWCARWWAT TWMSKAWRRK GKGTTYATRR TKTIMYRAAW 
3431 WAGRGTWiy:?; KARGTAGGTr TCATCC3ATC ACCCAAffTOG GAAAATAGAT GATATTTTCA GGgStaCTO 

3 li ISSS^ ''''Sii!?^ GGCAACTAAG GTrcrXTATGA ATITAAAGAA GrroGATITA 
^r^J^ GAGTTTAGGG GAGAAAAAGG AGGATAATCA TAATATTAAT AGIGGGAGCA 

III] S^S?^-^ TnTAGGAGG TPSAATSTAT GGAGATCTAA CAOCTIGGAG CATIGCaGGT GTCCAGATAG 

ll^r ^V^f:^ T^^'^ ACATCTGrTGA TTCAATTJACA TCXG lV l Wr TCCCAACAGG AGGAGGACAG 

III. CACTTACCAT CACTGATTGC AAGAAGCTTT CGGAAGAGGA GTTCGGAGGA CGAGAGACGA 

III ^ii^SJ^ TATAAACTCA AAAATGCAGA TGCTTGAATC AGTAGATATA arrAATTOGC CAAATCTCAA 
II] ^ ^TTGACTT GCnCATTCA CCTGAACAGA TTATATATAT CAAACIUTCC GAGTRTCGAG 

40SI TCATTTCCTG ACCATGAGIT GCCAAATCTC ACCTCCTTAA CAGATCGAAC GAGAGGACAG CGATTITCCT 
4131 A0C5AACGGTT ACGATTCGAC T GG CCGTCG T TIT
Further Characterization of RG2 Family Members:

Further sequencing of cloned RG2 polynucleotide sequences, as discussed
above, identified additional RG2 species, listed below. Additionally, further sequencing of
the 5' sections of RG2 sequences listed above resulted in modified and/or new sequence
information, also listed below. The AC15 sequences found in the 3' sections of RG2
family members have not changed.

Listed below are: four full-length species, RG2A, RG2B, RG2C and RG2S;
two near-complete species, each with a gap in the largest intron, RG2D and RG2J; and three nearly
complete RG2 gene sequences, RG2K, RG2N, and RG2O. The deduced translation
products (polypeptides) encoded by these RG2 species are also listed below. The
polypeptide sequences do not contain any gaps (unlike some of the polynucleotide
sequences), because all of the gaps in the sequences are in introns, i.e., there are no gaps
in exon, or coding, sequences.
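
To illustrate why a gap confined to an intron leaves the deduced polypeptide intact, the following Python sketch joins the exon segments of a genomic sequence and translates them with the standard genetic code. It is only a minimal sketch under stated assumptions: the toy sequence, the exon coordinates, and the function name `deduce_polypeptide` are hypothetical and are not taken from any RG2 sequence, and the run of N characters stands in for an unsequenced gap inside the intron.

```python
from itertools import product

# Standard genetic code, built in TCAG order (TTT, TTC, TTA, TTG, TCT, ...).
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W"
         "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR"
         "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def deduce_polypeptide(genomic, exons):
    """Translate the concatenated exons of a genomic sequence.

    exons is a list of (start, end) pairs, 0-based and half-open.
    Everything outside the exons (introns, including any gap) is ignored.
    Codons with ambiguous bases are reported as 'X'; translation stops
    at the first stop codon.
    """
    coding = "".join(genomic[s:e] for s, e in exons).upper()
    protein = []
    for i in range(0, len(coding) - 2, 3):
        aa = CODON_TABLE.get(coding[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical toy gene: exon 1, an intron containing a gap (N's), exon 2.
EXON1 = "ATGTCG"
INTRON = "GTAAGT" + "N" * 8 + "TTTCAG"   # gap lies entirely inside the intron
EXON2 = "GACCCATGA"
GENOMIC = EXON1 + INTRON + EXON2
EXONS = [(0, len(EXON1)),
         (len(EXON1) + len(INTRON), len(GENOMIC))]

PEPTIDE = deduce_polypeptide(GENOMIC, EXONS)
```

Because only the exon coordinates feed the translation, changing the length of the intron gap (and shifting the second exon's coordinates accordingly) yields the same polypeptide.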

They include: an RG2A polynucleotide sequence (SEQ ID NO:87) and its
deduced polypeptide sequence (SEQ ID NO:88); an RG2B polynucleotide sequence (SEQ
ID NO:89) and its deduced polypeptide sequence (SEQ ID NO:90); an RG2C
polynucleotide sequence (SEQ ID NO:91) and its deduced polypeptide sequence (SEQ ID
NO:92); an RG2D polynucleotide sequence (SEQ ID NO:93) and (SEQ ID NO:94), and its
deduced polypeptide sequence (SEQ ID NO:95); an RG2E polynucleotide sequence (SEQ
ID NO:96) and its deduced polypeptide sequence (SEQ ID NO:97); an RG2F
polynucleotide sequence (SEQ ID NO:98) and its deduced polypeptide sequence (SEQ ID
NO:99); an RG2G polynucleotide sequence (SEQ ID NO:100) and its deduced polypeptide
sequence (SEQ ID NO:101); an RG2H polynucleotide sequence (SEQ ID NO:102) and its
deduced polypeptide sequence (SEQ ID NO:103); an RG2I polynucleotide sequence (SEQ
ID NO:104) and its deduced polypeptide sequence (SEQ ID NO:105); an RG2J
polynucleotide sequence (SEQ ID NO:106) and (SEQ ID NO:107), and its deduced
polypeptide sequence (SEQ ID NO:108); an RG2K polynucleotide sequence (SEQ ID
NO:109) and (SEQ ID NO:110), and its deduced polypeptide sequence (SEQ ID NO:111);
an RG2L polynucleotide sequence (SEQ ID NO:112) and its deduced polypeptide sequence
(SEQ ID NO:113); an RG2M polynucleotide sequence (SEQ ID NO:114) and its deduced
polypeptide sequence (SEQ ID NO:115); an RG2N polynucleotide sequence (SEQ ID
NO:116) and its deduced polypeptide sequence (SEQ ID NO:117); an RG2O
polynucleotide sequence (SEQ ID NO:118) and its deduced polypeptide sequence (SEQ ID
NO:119); an RG2P polynucleotide sequence (SEQ ID NO:120) and its deduced
polypeptide sequence (SEQ ID NO:121); an RG2Q polynucleotide sequence (SEQ ID
NO:122) and its deduced polypeptide sequence (SEQ ID NO:123); an RG2S polynucleotide
sequence (SEQ ID NO:124) and its deduced polypeptide sequence (SEQ ID NO:125); an
RG2T polynucleotide sequence (SEQ ID NO:126) and its deduced polypeptide sequence
(SEQ ID NO:127); an RG2U polynucleotide sequence (SEQ ID NO:128) and its deduced
polypeptide sequence (SEQ ID NO:129); an RG2V polynucleotide sequence (SEQ ID
NO:130) and its deduced polypeptide sequence (SEQ ID NO:131); and an RG2W
polynucleotide sequence (SEQ ID NO:132) and its deduced polypeptide sequence (SEQ ID
NO:133).

Characterization of New RG Family Groups and RG Species:

Further BAC insert characterization and sequencing, as discussed above,
identified new RG polynucleotide sequences. The new sequences were characterized as
belonging to new RG families, designated RG5 and RG7. These RG polynucleotide
sequences, and their predicted translation products (the polypeptides encoded by
these sequences), are summarized and listed below.

Identified and listed below is an RG5 family member, designated as the RG5
polynucleotide sequence set forth in SEQ ID NO:134, and its deduced polypeptide
sequence (SEQ ID NO:135). This sequence contains an NBS region sequence.

Also identified and listed below is an RG7 family member, designated as the
RG7 polynucleotide sequence set forth in SEQ ID NO:136. No deduced polypeptide
sequence is given for the new RG7 family member, as this sequence appears to be a
pseudogene.

RG2A polynucleotide sequence (SEQ ID NO:87) 

AAAGTTCATATCCAAGCTTGCCCTCCAACTCTAGCTCCTTCAATGGCACC 
TCCTTCTCTTCAAAAGCACACAAGAACACTTTCAAGCTCAACCACACTCA 
CACAAGCTCTAGAACGAGGGTTAGGGCACATTTAGGGTTTTGCTCTCTGG
AAATGGTGTCTAAAAGTGAGGCCATAATGTTCCTTATATAAGGCTCACTC 
CCACAATTAGGCTTTCAATCTGAACGTANTACGCCCAGTGTACACTATGG 
TACGCCCAACGTACTCGGTAGTCTCCGCGTCAANAATACACTCATGAGTA 
cgcgcaacgtactttcccttacgcccagcgtactcaaaagccaaacattc
ttttcaaggactaattttgacaacttgaggaaagaaaaggatcaaagana 
tatacttgaattccgggatgttacaatgaagttganaccttggctaaaaa 
attaaattggttgtggaagccgttggctgagcaagcaacaagggtaaaat 

tcgtaatcracaaatggtgttattttctatttcttcttattattttaot
gatttacgggtagtttttttttcttacaaaaaatattaaagttgataaag 
tatagccactaaaattgactttttccaaaacataatgtcaaatggtgcgt 
atatgtatcatgttgtattanataatgaatatgatgatnctgttctattt 
aanccgaaaaaattatctaatgattttatattggaaaacaaagttgtgat 

ttttngcataatataatcaaatccncttttgtntgggaggtggataaatg
tggtaaatttanaacaagtgttttnacnttgaagggtntggaaaggttga 
aaaaagttaaaatgataaaatgtttacacaaatgttgtatccgactgaat 
atnatgtttaaggatnattgtattaaattgttgatatatagtaagcataa 
atatttagaattgtgacttaaatttataagttatncnaactggattgaaa 

catttttgatatanattaggaatgaaaatgagcaaccctaacatacttat
ctttggtagtttggttattatatttttattanaatatagaancatccctt 
tattttaaacccatattgtggacggacttgaataaatgggaaaaatgtac 
cttgctatttagcacaaaaaaattataaaaatgtacattgctatttagca 

taaaatgggaaatgtgttgctatttaatgcactaaaagaaactattttgc
ctttattaaaccgggtaaaccaatagaaaaatggaagtacattgtcattt 
agcatgaaaaaaaataactttccattttttgcatccggtcacaataatag 
aaa.^atgaaagtacgttgctatttagcgaaactaacttccttttttcttt 
ttggcatcgtatcataaaatatagactaaaatacgttagttttacatttt 

taatacattgaaatgtctaatccacatgttattctataaaaagggaaatg
taatttacttattcrttgattctttggcttctttttagtacccaaaacat 
ccctctatccatctattccaactaaaataatgaaaactatattccttcca 
ttgtagggatgttataaattttgtaattgtttttatgcaaaaaagtgttt 
tttgttaactagattaacgagattcatttttcagcattttaggagaagtt 

catccatcttttggatatgaagtgcaagccaagttctttaacatggaata
tgaggtccctatatgctcaaaaAatagcaaatgagaaattttttaaattg 
gatccccataaaagaaaatttgttaatggttgttttaatattggtcaatg 
tgtccaccggatgagcataatactagtttataaggggtaaaggtgggttt 
ggtgggcccatttatctttattatttctaaaagtcagaattaagtaaaaa 

aaattataagataaataccataaggataaaaaatcattttatttggacca
aagaccaaagttgttaaggggctgtttgttttttttgtgaagagctgtgc 
aaccacttttgtctgcgccgcacagacaacgtgcagacatatgccctcgc 
agagtgtttgttttttgaaagtgcgcagaccaaaaaaacgtctgcgcgag 
gtcatcctggcgcatatatgtgtcactgtcttcaaaggtcttcagacctc 

attttaaccaaaaaaaaaaaagaccaccggttttttttttttttt^

tttctcttgtagctgaaaatgcatttttaatctttatgacatgaaattaa 

gtttgaaaaattaatttatttcaacagctgtagacgttaaaaacaaacag 

tcttcttgttgcagactgtggacatttggtccacctcttctaccgcagag 
ACTTGCAGATGTGGTCCGCAGACTGCAGACATTTTGGCTTCAAATAAACA
AACATCACCTAATTTGACTACACCACACGGACCTCCAATGTAACAAAAAA 
AAGGTTGAAACAAAGTTGCCTATTTCTCCATATCCAGGGGCCATTTATGT 
AAGAGTTATCTAAATTTTAGTTCGGTAGATCAGTTCTCACATTTTAACCG 
GGTAAAGTGTATGTGTGTACGCGCGCACCTGAAAGGTTTGAANGTAACTT
CCA.\ACTGAANCAANAATCGATATGAAGTATCAAGTTAGAGGTTCAATTG 
GTGAAGGAATCAGCTGGAGGTTGGGGAATCGAGCTTCCACTATTAAGGTA 
AAATCCATAACCCTAAATGTTGGTACGCTCATATATCAAATTGCGTGTTT 
TGTTGAATGAAAAAAGCATGCTCAAAAAACCAGTGTAAGGCACGGTATAT 

GACATATTTATAGTTACTGATAACAAATTATGATAATTTTGGGTTTACGT
AAGTTAGGATTCGTACTTCAACCAAATGTAATAGTTTTTGTGAGTCTATC 
TATGTATTTGGGGAATCACATTAGCAACGGGATTGTACTAGTAATTCGAA 
AAAGTCTTTTAAATAATTTTTCTGTTTATAATTTATGAATAGTTTTAGCG 
ACATCTAATATTAAATAGAATGTATCTGATATTGAATTAATGTCCTTAAT 

15 GTGAACATAGACCTTTTCCATTTACTAATGCCTAATTATTAGTTTCTAAT
CAATAAATTTTAATTTCTGTTTTATGCTTCTAAGACAATAAAAATCCATG 
ATTTACCTTTAAATATTAACAAAAATGACCATAAATAAATAAAAAATTAG 
GATACCAAACCCCCCCGCCATGCCCAATGTCTAAATATTCTTGATGCTTT 
TGCTTTTCCCTCTTTTCCTTGTTAGTCTATTATTCTGGAGAGTTTGAGAG 

20 AGTTTCATACAAGAAAATTTCAAGAAGAAAGCAAAGGTCCAGGTATTCTC 
TTTTCTTAATTATGTATTAACTTACAAGCATTTTTTACACGATCCATGGT 
TTTTTGTGTATGTTTTTCAAATTGAAACTAGATTGGGACTTTTGCCCTTG 
ATGATTCATAAGATATTGCATGGAGTTGAGATTGTGTAAGAAAAGTGGTG 
AATAGAAAGAGCAAGTGAATCCAGATATAGTATTGGTAATATATGATGAT 

25 GAGATAGAGATATGTTAAAACTGGCTAGAAAATTGTTTTAATTTGAAATT 
TAGGTTGTTGAATTTGAAAGATACCAAGCTAATAACTAATTAGTTATGCT 
AAATAGTTATAAAGAACAACAAACTCGTAGTTTTTTTTTCATGATTTTCA 
ACCTCTTCGTACCAAACTAAATTATAACAAAATTGAATATCATTCTCTGC 
AATCAATTTTAACTTTTGTTATTATCATCATGTCTAAAATTGCCACAAGT 

30 TTATTTTCATAGTCATATTGGATTATGAAAGGACTATTTTTACCAATTAC 
ATCTTTACTTTATGGCCAAAGCTAATACAATCCGACTAAACTAAAGGATT 
CTAGGATGCATATAGTTTGCTCCCCGATTATAGATTTCTATCTAATTTGT 
CTATTGTACTAATTTAGGTGCCACCACAAGTAAATTCCTGAAATGGATGT 
CGTTAATGCCATTCTTAAACCAGTTGTCGAGACTCTCATGGTACCCGTTA 

35 AGAAACACATAGGGTACCTCATTTCCTGCAGGCAATATATGAGGGAAATG 
GGTATCAAAATGAGGGGATTGAATGCTACAAGACTTGGTGTCGAAGAGCA 
CGTGAACCGGAACATAAGCAACCAGCTTGAGGTTCCAGCCCAAGTCAGGG 
GTTGGTTTGAAGAAGTAGGAAAGATCAATGCAAAAGTGGAAAATTTCCCT 
AGCGATGTTGGCAGTTGTTTCAATCTTAAGGTTAGACACGGGGTCGGAAA 

40 GAGAGCCTCCAAGATAATTGAGGACATCGACAGTGTCATGAGAGAACACT 
CTATCATCATTTGGAATGATCATTCCATTCCTTTAGGAAGAATTGATTCC 
ACGAAAGCATCCACCTCAATACCATCAACCGATCATCATGATGAGTTCCA
GTCAAGAGAGCAAACTTTCACAGAAGCACTAAACGCACTCGATCCTAACC
ACAAATCCCACATGATAGCCTTATGGGGAATGGGCGGAGTGGGGAAGACG

ACAATGATGCATCGGCTCAAAAAGGTTGTGAAAGAAAAGAAAATGTTTAA 

TTTTATAATTGAGGCGGTTGTAGGGGAAAAAACAGACCCCATTGCTATTC 

AATCAGCTGTAGCAGATTACCTAGGTATAGAGCTCAATGAAAAAACTAAA 

5 CCAGCAAGAACTGAGAAGCTTCGGAAATGGTTTGTGGACAATTCTGGTGG 
TAAGAAGATCCTAGTCATACTCGACGATGTATGGCAGTTTGTGGATCTGA 
ATGATATTGGTTTAAGTCCTTTACCAAATCAAGGTGTCGACTTCAAGGTG 
TTGTTGACATCACGAGACAAAGATGTTTGCACTGAGATGGGAGCTGAAGT 
TAATTCAACTTTTAATGTGAAAATGTTAATAGAAACAGAAGCACAAAGTT 

10 TATTCCACCAATTTATAGAAATTTCGGATGATGTTGATCCTGAGCTCCAT 
AATATAGGAGTGAATATTGTAAGGAAGTGTGGGGGTCTACCCATTGCCAT 
AAAAACCATGGCGTGTACTCTTAGAGGAAAAAGCAAGGATGCATGGAAGA 
ATGCACTTCTTCGTTTAGAGCACTATGACATTGAAAATATTGTTAATGGA 
GTTTTTAAAATGAGTTACGACAATCTCCAAGATGAGGAGACTAAATCCAC 

15 CTTTTTGCTTTGTGGAATGTATCCCGAAGACTTTGATATTCTTACCGAGG 
AGTTGGTGAGGTATGGATGGGGGTTGAAATTATTTAAAAAAGTGTATACT 
ATAGGAGAAGCAAGAACCAGGCTCAACACATGCATTGAGCGGCTCATTCA 
TACAAATTTGTTGATGGAAGTTGATGATGTTAGGTGCATCAAGATGCATG 
ATCTTGTTCGTGCTTTTGTTTTGGATATGTATTCTAAAGTCGAGCATGCT

20 TCCATTGTCAACCATAGTAATACACTAGAGTGGCATGCAGATAATATGCA 
CGACTCTTGTAAAAGACTTTCATTAACATGCAAGGGTATGTCTAAGTTTC 
CTACAGACCTGAAGTTTCCAAACCTCTCCATTTTGAAACTTATGCATGAA 
GATATATCATTGAGGTTTCCCAAAAACTTTTATGAAGAAATGGAGAAGCT 
TGAGGTTATATCCTATGATAAAATGAAATATCCATTGCTTCCCTCATCAC 

25 CTCAATGTTCCGTCAACCTTCGCGTGTTTCATCTACATAAATGCTCGTTA 
GTGATGTTTGACTGCTCTTGTATTGGAAATCTGTCGAATCTAGAAGTGCT 
TAGCTTTGCTGATTCTGCCATTGACCGGTTGCCTTCCACAATCGGAAAGT 
TGAAGAAGCTAAGGCTACTGGATTTGACGAATTGTTATGGTGTTCGTATA
GATAATGGTGTCTTAAAAAAATTGGTCAAACTGGAGGAGCTCTATATGAC

30 AGTGGTTGATCGAGGTCGAAAGGCGATTAGCCTCACAGATGATAACTGCA
AGGAGATGGCAGAGCGTTCAAAAGATATTTATGCATTAGAACTTGAGTTC 
TTTGAAAACGATGCTCAACCAAAGAATATGTCATTTGAGAAGCTACAACG 
ATTCCAGATCTCAGTGGGGCGCTATTTATATGGAGATTCCATAAAGAGTA 
GGCACTCGTATGAAAACACATTGAAGTTGGTTCTTGAAAAAGGTGAATTA 

35 TTGGAAGCTCGAATGAACGAGTTGTTTAAGAAAACAGAGGTGTTATGTTT
AAGTGTGGGAGATATGAATGATCTTGAAGATATTGAGGTTAAGTCATCCT 
CACAACTTCTTCAATCTTCTTCGTTCAACAATTTAAGAGTCCTTGTCGTT
TCAAAGTGTGCAGAGTTGAAACACTTCTTCACACCTGGTGTTGCAAACAC
TTTAAAAAAGCTTGAGCATCTTGAAGTTTACAAATGTGATAATATGGAAG 

40 AACTCATACGTAGCAGGGGTAGTGAAGAAGAGACGATTACATTCCCCAAG 
CTGAAGTTTTTATCTTTGTGTGGGCTACCAAAGCTATCGGGTTTGTGCGA
TAATGTCAAAATAATTGAGCTACCACAACTCATGGAGTTGGAACTTGACG 
ACATTCCAGGTTTCACAAGCATATATCCCATGAAAAAGTTTGAAACATTT 






AGTTTGTTGAAGGAAGAGGTAAATATAAATTTTTAATGCTAATACATTAC 
AAAGGATCTTTTCAGTTAAATCTTTCAAAATATATTGTAATTTGATTGTA 
TGGGGTATTATTGTTGGATGGGACTATTAATAAATGATTATCTTGCAGGT 
TCTGATTCCTAAGTTAGAGAAACTGCATGTTAGTAGTATGTGGAATCTGA 

5 AGGAGATATGGCCTTGCGAATTTAATATGAGTGAGGAAGTTAAGTTCAGA 
GAGATTAAAGTGAGTAACTGTGATAAGCTTGTGAATTTGTTTCCGCACAA 
GCCCATATCTCTGCTGCATCATCTTGAAGAGCTTAAAGTCAAGAATTGTG 
GTTCCATTGAATCGTTATTCAACATCCATTTGGATTGTGTTGGTGCAACT 
GGAGATGAATACAACAACAGTGGTGTAAGAATTATTAAAGTGATCAGTTG 

10 TGATAAGCTTGTGAATCTCTTTCCACACAATCCCATGTCTATACTGCATC 
ATCTTGAAGAGCTTGAAGTCGAGAATTGTGGTTCCATTGAATCGTTATTC 
AACATTGACTTGGATTGTGCTGGTGCAATTGGGCAAGAAGACAACAGCAT 
CAGCTTAAGAAACATCAAAGTGGAGAATTTAGGGAAGCTAAGAGAGGTGT 
GGAGGATAAAAGGTGGAGATAACTCTCGTCCCCTTGTTCATGGCTTTCAA 

15 TCTGTTGAAAGCATAAGGGTTACAAAATGTAAGAAGTTTAGAAATGTATT 
CACACCTACCACCACAAATTTTAATCTGGGGGCACTTTTGGAGATTTCAA 
TAGATGACTGCGGAGAAAACAGGGGAAATGACGAATCGGAAGAGAGTAGC 
CATGAGCAAGAGCAGGTAAGGATTTCAATTTCACTGTCTTAATTAATGAT 
TAAGCTCCTGCTTTTTGAATAAAAAAGGGACAAACCATTTCATGACTTAA 

20 TGTAGCAATACAAGTCATGTATAAGAGTGACCAACTCTTTTTTATTTATA 
AAATGACTACAAAATATTTTTTTTCATTAGAGATCATGTATAAATGTGAC 
TAATTTTTCATCACCTAACTTTAGTTGATAAATCTTTATAAATGTCACTA 
GTTACTTTTCAGTAAAATAACAAATTTAATAAATTATCAACAAAAAGCAT 
CAACTAAAAAAATCCCACAACCCGTAATAATTTAAAATAAAAGGATTTAA 

25 CATCTAATACGAACAATTTTTTTTCTAAACATGATTTGGACCAAATATCA 
CCAGCAACTCAAGTTTGGAATCGATTCAGCTTAAAACTTGACCAGCATAA 
TTAGATAGATGAGAGTTGAAGCTAAAGTGCCTATATAAGTTCGTTTCATC 
TTTTTTCTTGATCTTGATAGCAAGTTGAATGATTTTCTTCTTCAAAATTG 
ATA.\AAATCTACATTATAAAGAGACTAGCTTGAAAAAAAATGGTCTAGGT 

30 GGGTCTTGGGTTCTGGTAGATGAAGATGGAAGGGGAGAGTAGATTTCAAA 
GACACAACACATCCTTCATTTTATTTATTTATTATTATTATTATTTTTTG 
ATATCTTGCTCATATTTGTTACAGATATGTGAGGTCTATTAATCTTTTTA 
AATATATAAAAAAATAAATAACATAAATGAGAAAATTAAATAAAGAATAA 
ATTAATAAGGGCACAATAGTCTTTTTAGGTAAGACAAGGACCAAACACGC 

35 AACAAAAATAAACAGTAGGGACCATCCGATTTAAAAAAAATAATTAGGGA 
CCAAAAACATAAATTCCCCCAAACCATAGGGACCATTCATGTAATTTACT 
CTTACTTTTCGTTTTGTTCATATTTGGGTAACTATTTTTTTTGTACACAT 
CTAGGTAACGAACTTGTTGAAGTGTTCCCATTTAGGATGTGACCTACTAC 
AACCGATCATAATAGTCATATGTGAACACTTCCAACAACTTTATTACTTA 

40 GGTGTGTACAAAAAAACAATAGTTACCATGATGTGAACATACTGAAAAAT 
TAATTACCTTAGCAAGTTATTTTCCCATTTAGGTTGTATGGAAACAGTTC 
CGTGAGACCGTGACTTGGATGGTAGATAAATTTAGTAAACTTAACCCTTC 
AATTAACCTACCTTTTTCTTATTAACTCAATTTCAACCTAAATTCTGATT 
CTTGTTTGAAAGTAAGTTGCATCTTTATTTTTGTATTATCTTGTTGCATA
GGATCCTTAGCATCTTTTAATAATTTATTTGAAGGTGAAAGATCCAACTA 
TTTTTAATCTGTTGGCATTTTCCATCATTTGCAACTGTTTCTTGAAAAAA 
AAATACCTAAAATCAAAATAACCATTTTCAAATCCAAAATTATAAGAGAG 

5 AATTGTAAATGGACATGGAATCATAAATCATTAACACAGTTCAGTAAACA 
AGTTGCTAATTACATTTCTTGCTGTGCAGATTGAAATTCTATCAGAGAAA 
GAGACATTACAAGAAGCCACTGACAGTATTTCTAATGTTGTATTCCCATC 
CTGTCTCATGCACTCTTTTCATAACCTCCAGAAACTTATATTGAACAGAG 
TTAAAGGAGTGGAGGTGGTGTTTGAGATAGAGAGTGAGAGTCCAACAAGT 

10 AGAGAATTGGTAACAACTCACCATAACCAACAACAACCTATTATACTTCC 
CAACCTCCAGGAATTGATTCTATGGAATATGGACAACATGAGTCATGTGT 
GGAAGTGCAGCAACTGGAATAAATTCTTCACTCTTCCAAAACAACAATCA
GAATCCCCATTCCACAACCTCACAACCATAAAAATTATGTATTGCAAAAG 
CATTAAGTACTTGTTTTCGCCTCTCATGGCAGAACTTCTTTCCAACCTAA 

15 AGCATATCAAGATAAGAGAGTGTGATGGTATTGGAGAAGTTGTTTCAAAC 
AGAGATGATGAGGATGAAGAAATGACTACATTTACATCTACCCACACAAC 
CACCACTTTGTTCCCTAGTCTTGATTCTCTCACTCTAAGTTTCCTGGAGA 
ATCTGAAGTGTATTGGTGGAGGTGGTGCCAAGGATGAGGGGAGCAATGAA 
ATATCTTTCAATAATACCACTGCAACTACTGCTGTTCTTGATCAATTTGA 

20 GGTATGCTTTGTACATATTCAATTATTTATTTAATTTCCTTTTTTATTTG 
CAATATTCTATAAATAATACATTTTATACCCACTATACTAAGATAATAAT 
TACCTAGAGGGATGGATGCTATGACACAGCTGCTACACTTCAGAAACTCT 
AGT.AAGGGCAGTTATGGAAGTTCAATAAAATGATAATGGCATCTTTTGAT 
GGGTAATATAGGCAATTTAAGTTTTATTTCTGTTAAAGCAGTATTTAGCA 

25 AGTACTGGCCAGTAGGAGAGGAGAATATCACCTTTTGTGAAAATCTGGTC 
ATTGTACCCAGAATTTAGTTAAATGTAACATTTTAGATATCAGGGGTCAT 
CAGGTGACAGATATTGTAGAATAGAACAATATATAATATCACCCAAAACT 
ATTTTTTCTAAGGTTATTCTGTTAAATATGTGCTTTCTTGTTTTCATNGA 
ATTNGCATTCGTATATTTTAGGTGTTAAAGTGATTTTNTCTTCAATAAAT 

30 CCCGAAATTAATTAAAAAAAAAAAAACAAAAGTACATTTTTGATGTGGAG 
AGCACTGGTATCACTTAGTATATAAAAAGCrrGATTTTGAATTAACTTTC 
TTATACAAAAGTTGTGTATATAGTTTAATTAGTTTTACATCATTTTTCCA 
TGTGGTGTTGCAGTTGTCTGAAGCAGGTGGTGTTTCTTGGAGCTTATGCC 
AATACGCTAGAGAGATGAGAATAGAATTCTGCAATGCATTGTCAAGTGTA 

35 ATTCCATGTTATGCAGCAGGACAAATGCAAAAGCTTCAAGTGCTGACAGT 
AAGTGATTGCAAAGGGATGAAGGAGGTATTTGAAACTCAATTAAGGAGGA 
GCAGCAACAAAAACAACAAGAGTGGTGCAGGTGAGGAAGGAATTCCAAGA 
GTAAATAACAATGTTATTATGCTTTCTGGTCTGAAGATATTGGAAATCAG
CTTTTGTGGGGGTTTGGAACATATATTCACATTCTCTGCACTTGAAAGCC 

40 TGAGACAGCTCCAAGAGTTAAAGATAACATTTTGCTACGGAATGAAAGTG 
ATTGTGAAGAAGGAAGAAGATGAATATGGAGAGCAGTAAACAACAACAAC 
AACAACAATAACGAAGGGGGCATCATCATCATCATCTTCTTCATCTTCTA
AGGAGGTTGTGGTCTTTCCTCGTCTCAAATCCATTGAACTAAATGATGTA 
CCAGAGCTGGTAGGATTCTTCTTGGGGAAGAATGAGTTCCGGTTGCCTTC
ATTGGAAGAAGTTACCATCAAGTATTGCTCAAAAATGATGGTGTTTGCAG 
CTGGTGGGTCCACAGCTCCCCAACTCAAGTATATACACACAGAATTAGGC 
AGACATGCTCTTGATCAAGAATCTGGCCTTAACTTTCATCAGGTATATAT 
5 ATTTCTTTAATTGGCATCATCTAATTAAGAAAGATATCATTCCTGCCAAG 
TAA.\TTTACTTCAAACACATTCACACTGGTTTCAGTCTAAGTTTATGTTG 
TTCTAGGAAGGCCAAAATGGGAAAGCAAGATAGGGAAAAATAGTGTATTT 
CAGTGGAAAGGGTATTTTAGGTATTTTCTGTCAAAAGTTGTTATTGCAGG 
CTTTTTAGTACCTGGAATCGTGTGTGGGAGGAGCATTATTATTCTGATTT 

10 GCTTGTTTCTTTATCATTTTTTCTTAGCCTCTGGAACAGCTAGAAACCCT 
TTTAATCTTTTGATTTTCAATGACAAAATTTTTCCTGTTACTACATTTGA 
TTGTTGTTCTTCATGGTTCTAAGTGAGTTATTGGCTCATCTGTTACTTCT 
TTTGATTGTTATTTTCATATCATGTTAGTCACTTGAATCAAGCTTTTCTA
TTTTCAACCAGGGCAAAAGGTCAAAAGTAACCTACTTTATGAGATCAAAA 

15 ACAGCAACCCATCGGATAACTTTTAGTTGGAGTTAATAGTTACAATTACC 
ATTGTGATTAATAATTATAATATCTTGTATTAATTCATAAAAATTGGTAC 
AGCACATATATGACATTTCAAAGGTTTTTGTTTGACATATATATGCCTCT 
GGCGTTTTCTTTATTGGACATGCAGACCTCATTCCAAAGTTTATACGGTG 
ACACCTTGGGCCCTGTAACTTCAGAAGGGACAACTTGTTCTTTTCATAAC 

20 TTGATCGAATTATATATGGAATTTAATGATGCTGTTAAAAAGATTATTCC 
ATCCAGTGAGTTGCTGCAACTGCAAAAGCTGGAAAAGATTCATGTGACTT 
ATTGTAATTGGGTAGAGGAGGTATTTGAAACTGCATTGGAAGCAGCAGGG 
AGAAATGGAAATAGTGGAATTGGTTTTGATGAATCGTCACAAACAACTAC
CACTACTCTTGTCAATCTTCCAAACCTCAGAGAAATGAAGTTATGGTATC 

25 TAAATTGTCTGAGGTATATATGGAAGAGCAATCAGTGGACAGCATTTGAG
TTTCCAAACCTAACAAGAGTCGATATATGGGGATGTGATAGGTTAGAACA 
TGTATTTACTAGTTCCATGGTTGGTAGTCTATTGCAACTCCAAGAGCTAC 
GCATATGGAACTGCAGTCAGATAGAGGTCGTGATTGTTCAGGATGCAGAT 
GTTTGTGTAGAAGAAGACAAAGAGAAAGAATCTGATGGCAAGACGAATAA 

30 GGAGATACTTGTGTTACCTCGTCTAAAGTCCTTGATATTAAAACACCTTC 
CATGTCTTAAGGGGTTTAGCTTGGGGAAGGAGGATTTTTCATTCCCATTA
TTGGATACYTTGGAAATCTACRAATGCCCAGCAATAACCACCTTCACCAA 
GGGAAATTCCRCTACTCCACAGCTAAAAGAAATTGAAACAMATTTTGGCT 
TCTTTTATGCTGCAGGGGAAAAAGACATCAACTCCTCTATTATAAAGATC 

35 AAACAACAGGTAAACCAGATCTTTGTTGCTTNNATAATTCTTAAACNACA 
TNTGAAAAGCTTCATGCAAGTTTTTTTNGTTATATNGTCAAAAACCGCAA 
CCTACATTTTCAGCTTTANATTTATGTACTTTATGCAGGATTTCAAACAA 
GACTCTGATTAATGTGAAGTGAATATTAAAGGTAAATTATATTTTCATGT 
TCCTAGTNGCCTATTAATTAAAGGCCTTTTAGTTCGNGATTTTTGGATGT

40 ATTCTTCATGATGATGTCAATCTTCTAATACCCCATTCATTGTTTGGTTG 
AATGTTGACTCTATGTCAGGATGAATATTCAAGGGAAGAATTGTTCATCA 
TATGAAGGACATTAAAGAACATGGATGCTCTGAAGATGTTGGGAACACA 

RG2A deduced polypeptide sequence (SEQ ID NO:88)

MDVVNAILKPVVETLMVPVKKHIGYLISCRQYMREMGIKMRGLNATRLGVEEHVN
RNISNQLEVPAQVRGWFEEVGKINAKVENFPSDVGSCFNLKVRHGVGKRASKIIEDI
DSVMREHSIIIWNDHSIPLGRIDSTKASTSIPSTDHHDEFQSREQTFTEALNALDPNHK

SHMIALWGMGGVGKTTMMHRLKKVVKEKKMFNFIIEAVVGEKTDPIAIQSAVADY
LGIELNEKTKPARTEKLRKWFVDNSGGKKILVILDDVWQFVDLNDIGLSPLPNQGV
DFKVLLTSRDKDVCTEMGAEVNSTFNVKMLIETEAQSLFHQFIEISDDVDPELHNIG
VNIVRKCGGLPIAIKTMACTLRGKSKDAWKNALLRLEHYDIENIVNGVFKMSYDNL
QDEETKSTFLLCGMYPEDFDILTEELVRYGWGLKLFKKVYTIGEARTRLNTCIERLI

HTNLLMEVDDVRCIKMHDLVRAFVLDMYSKVEHASIVNHSNTLEWHADNMHDSC
KRLSLTCKGMSKFPTDLKFPNLSILKLMHEDISLRFPKNFYEEMEKLEVISYDKMKY
PLLPSSPQCSVNLRVFHLHKCSLVMFDCSCIGNLSNLEVLSFADSAIDRLPSTIGKLK
KLRLLDLTNCYGVRIDNGVLKKLVKLEELYMTVVDRGRKAISLTDDNCKEMAERS
KDIYALELEFFENDAQPKNMSFEKLQRFQISVGRYLYGDSIKSRHSYENTLKLVLEK

GELLEARMNELFKKTEVLCLSVGDMNDLEDIEVKSSSQLLQSSSFNNLRVLVVSKC
AELKHFFTPGVANTLKKLEHLEVYKCDNMEELIRSRGSEEETITFPKLKFLSLCGLP
KLSGLCDNVKIIELPQLMELELDDIPGFTSIYPMKKFETFSLLKEEVLIPKLEKLHVSS
MWNLKEIWPCEFNMSEEVKFREIKVSNCDKLVNLFPHKPISLLHHLEELKVKNCGSI
ESLFNIHLDCVGATGDEYNNSGVRIIKVISCDKLVNLFPHNPMSILHHLEELEVENC

GSIESLFNIDLDCAGAIGQEDNSISLRNIKVENLGKLREVWRIKGGDNSRPLVHGFQS
VESIRVTKCKKFRNVFTPTTTNFNLGALLEISIDDCGENRGNDESEESSHEQEQIEILS
EKETLQEATDSISNVVFPSCLMHSFHNLQKLILNRVKGVEVVFEIESESPTSRELVTT
HHNQQQPIILPNLQELILWNMDNMSHVWKCSNWNKFFTLPKQQSESPFHNLTTIKI
MYCKSIKYLFSPLMAELLSNLKHIKIRECDGIGEVVSNRDDEDEEMTTFTSTHTTTT

LFPSLDSLTLSFLENLKCIGGGGAKDEGSNEISFNNTTATTAVLDQFELSEAGGVSW
SLCQYAREMRIEFCNALSSVIPCYAAGQMQKLQVLTVSDCKGMKEVFETQLRRSSN
KNNKSGAGEEGIPRVNNNVIMLSGLKILEISFCGGLEHIFTFSALESLRQLQELKITFC
YGMKVIVKKEEDEYGEQ.TTTTTITKGASSSSSSSSSKEVVVFPRLKSIELNDVPELV
GFFLGKNEFRLPSLEEVTIKYCSKMMVFAAGGSTAPQLKYIHTELGRHALDQESGL

NFHQTSFQSLYGDTLGPVTSEGTTCSFHNLIELYMEFNDAVKKIIPSSELLQLQKLEK
IHVTYCNWVEEVFETALEAAGRNGNSGIGFDESSQTTTTTLVNLPNLREMKLWYL
NCLRYIWKSNQWTAFEFPNLTRVDIWGCDRLEHVFTSSMVGSLLQLQELRIWNCSQ
IEVVIVQDADVCVEEDKEKESDGKTNKEILVLPRLKSLILKHLPCLKGFSLGKEDFSF
PLLDTLEIYKCPAITTFTKGNSTTPQLKEIETHFGFFYAAGEKDINSSIIKIKQQDFKQ

DSD.CEVNIK

RG2B polynucleotide sequence (SEQ ID NO:89) 

TTTTTTAAGATCAGGGATTCAAATTCAGCCCTAGTGATTACAATTGTGTC 
TAA.ACTTTCCCATACCTTCACATTATTGTAAGTATACTTTCTCAGTTTCT 
40 CTCTTGGAAGCTTCCTTGGTATTTTAACTCGTGTTCTAATATTTAACTCT 
GATAGTTATTTTGGCCAATCTACTATCTGCATGTCCGGTTATTGAATCCG 
AAGGCACTGGAATCTTGGATTCCATTCCGTTGTGTGTTTGGTTGCCAAAT 
GAACGGAATTGAATTATGTAAGATTCCTTCAAAATCCATGTTTAGGTATA
TCGTTGTTTCTTGGGATGGATGGTAAAGAACGGAATTTCTCCTGTTCATT 
TTTTAATGAAAGACCAAATTGACCTTATAAACCTGTTAAAAAAATTACAT 
TCCAGTTTTCTTAACAAACTGAAAATGGTAAAGGAGTGTGATTGAATTCC 
5 AATCTGTTTCCTGTCCAAAACACGTGACGGAATATTACAATTCCTTCAAA 
TTTCATTTTCTTAAATTGTTATTCCCTTTCTTACAAAAACAAGGTAAACG 
AAACACCCGCTTACTTAATCATACTCCTACATGATGTAAATGAAAAGGGT 
ATAAATGGTATTTTATTCACAGGGATGAGTCACCATGGTCATGAAAGAAT 
CATTAACCGCCCTTACCCAATTCATGTTTGCCCCTAAAATATGATTTAAA 

10 GTAATATTGGCTTATGGGATTCAAGTTGACTTTTTTGTGGCGAAGAAATA 
ATGAAAATCTTCATTTCTAAAGTGTCTTCTACCACTGACATTTTCTAAGA 
AAGAACTTGCTAGAAGAAGGTGGGTTGTTTAGTCTTTTTACTCTTTAAAT 
GTGAAGACTGTTGAGTTATTATTATTATTTTGCCAACTATGGACAACTTG 
TTTAGTTTTTTTTTTTCCCCAATATCCATTTATATGCGATTTATTTCTGA 

15 AAT.AATTTTATCAAAACGCAGGAAACAATGTAGAATAATACTGGTATAAT 
TAATTATATAAAGTTATTAGGCTGAAATCTTGAGGCTACTATAATTTAAT 
TATCATAATTTGAAAATCATCAAAfrGTATTCCATGTATATTTATGTTAT 
CAGATAATTAATAATATGTGAGCCACACAAATCCACATCATCAGACACCC 
CACCTTATTGTCGGCTACCTCACCACTTGCATGATCCCGACATCTTCCCA 

20 ACCCCACCGACGACTTGGGGTCTCCTTAATATATCAATTATTTTCTGTAA 
GTATTTATTTGTGTAAATGTGTAATGTCATTTTACCTTTTTTCTAATATA 
TACAGAAACATAAATTTTAAATGAAATTCAACTGCGTTTCATTCTTGCAT 
TAA-\AAAAAAGACTGTACTGTTGTCAATATTTTACTTATAACCTGATTAA 
TTA-ATTAAAGCGTAATTGCATAATTTGCATTAGGTTGTAATTTTGTGTTT 

25 TATAGGGAGGGTGAGGGTCACCGGGAATCAAAGCACTTATGTAAAAGCAG 
GGA.\ATACAAAAAATTTACTCGAAACAAATTTTATTCAATTTAAGTGAGA 
TAATAATGTTCTGATTAGATTATGAGAACTAGGAGATTTAAGTGATATAT 
CCCATTTAAAAGAAATTGCATTATTAATTTTGGATCTCTTGATGATGACA 
AAATTAACTCGTGACAGGTTATATATCATATACAAAATGAGTGGCTATGC 

30 TTTCGCTTTCCAAAAAGCAATTATAGTTATACTACACCTACAAATTTTAA 
AAGGGGTTAAACATATCAAAATACTTGATAAGTAATTATATAAATATGCA 
TTT.\ACCCTCTAAAGAAAATGCTACTAAGCTTGGACCATCTCAGAATTAC 
AATCATACCCTTCCCCTCAAAAAAGATTCGTATATATCATGTCATTTGGC 
ATTCATTTCTTTTTCACAATTCATAGTTCTATTCTCAAAAAATTCGAGTT 

35 CTCGTATTTGTAAGGAAGATCAGAAGAGACTGTTCACACAGGTACTCTCT 
TTTATTTATTGATTCACATTCATATATGTTATTGTTTTCTTGCTTAATGG 
TTTCGTCAGTCTAACTGCGCTTGCTGATTTAAATTTCTTCACTTTCTTCC 
ACGGATTTTTTAAATATTAGTTTTGTGAATGAACAATTGGTGAAGGAAAG 
AAACATGGGAGTCTTTTCTAAAGTAAACCTAGATACTTAGGTTATAAGGG 

40 TATATGCTAAAATGAACTATGCCCATTCACCTTTGCCTTTTCTTTTACTT 
TTTAGTTTTTAGAATCCAAGTTTTCATATGTATCTCGATGTGTGAGAAGA 
ATAGGCATTAGAAAGGTAAAGGACGTACATAAAATTGATTAATTAGTGAA 
TGTTCTTTGATATCATTATTTTTACTCTCATAAAAAGCATATAGATCAAA 
CACAAATTGCTACTTGTTAGTGTAACAACTTCGACTTAATAATGTTAATA
ATC.\AGATTCTCTTGATTTCAACTATTTTCTAACCGAACAAGCTCACTAA 
AAACTCATATTGCTTTGAGTCTGAGTGGTTTATATTTGGGGTTTTACATT 
TAATTTTTTGTGCATGAATGTGAAAATAGACTGCTTATTGATTCTTTGTG 

5 TTTCATTGAGTTGATTTTCATTATTACTACCTTACAAATTGCTCAGTGAT 
AGATTTCCATTAATTTGCTAATTCGGTTGCTTCTAAATATGTAGGAGCTA 
CTA.-\AAGCAAAAATATCGAGCAATGTCGGACCCAACGGGGATTGCTGGTG 
CCATTATTAACCCAATTGCTCAGACGGCCTTGGTTCCCGTTACGGACCAT 
GTAGGCTACATGATTTCCTGCAGAAAATATGTGAGGGTCATGCAGATGAA 

10 AATGACAGAGTTGAATACCTCAAGAATCAGTGTAGAGGAACACATTAGCC 
GGAACACAAGAAATCATCTTCAGATTCCATCTCAAACTAAGGAATGGTTG 
GACCAAGTAGAAGGGATCAGAGCAAATGTGGAAAACTTTCCGATTGATGT 
CATCACTTGTTGTAGTCTCAGGATCAGGCACAAGCTTGGACAGAAAGCCT 
TCAAGATAACTGAGCAGATTGAAAGTCTAACGAGACAACTCTCCCTGATC 

15 AGTTGGACTGATGATCCAGTTCCTCTAGGAAGAGTTGGTTCCATGAATGC 
ATCCACCTCTGCATCATTAAGTGATGATTTCCCATCAAGAGAGAAAACTT 
TTACACAAGCACTAAAAGCACTCGAACCCAACCAAAAATTCCACATGGTA 
GCCTTGTGTGGGATGGGTGGAGTGGGGAAGACTAGAATGATGCAAAGGCT 
GAAGAAGGCTGCTGAAGAAAAGAAATTGTTTAATTATATTGTTGGGGCAG 

20 TTATAGGGGAAAAGACGGACCCCTTTGCCATTCAAGAAGCTATAGCAGAT 
TACCTCGGTATACAACTCAATGAAAAAACTAAGCCAGCAAGAGCTGATAA 
GCTTCGTGAATGGTTCAAAAAGAATTCAGATGGAGGTAAGACTAAGTTCC 
TCATAGTACTTGACGATGTTTGGCAATTAGTTGATCTTGAAGATATTGGG 
TTAAGTCCTTTTCCAAATCAAGGTGTCGACTTCAAGGTCTTGTTGACATC

25 ACGAGACTCACAAGTTTGCACTATGATGGGGGTTGAAGCTAATTCAATTA 
TTAACGTGGGCCTTCTAACTGAAGCAGAAGCTCAAAGTCTGTTCCAACAA
TTTGTAGAAACTTCTGAGCCCGAGCTCCAGAAGATAGGAGAGGATATCGT 
AAGGAAGTGTTGCGGTCTACCTATTGCCATAAAAACCATGGCATGTACTC 
TTAGAAATAAAAGAAAGGATGCATGGAAGGATGCACTTTCGCGCATAGAG 

30 CACTATGACATTCACAATGTTGCGCCCAAAGTCTTTGAAACGAGCTACCA 
CAATCTCCAAGAAGAGGAGACTAAATCCACTTTTTTAATGTGTGGTTTGT 
TTCCCGAAGACTTCGATATTCCTACTGAGGAGTTGATGAGGTATGGATGG 
GGCTTGAAGCTATTTGATAGAGTTTATACGATTAGAGAAGCAAGAACCAG 
GCTCAACACCTGCATTGAGCGACTGGTGCAGACAAATTTGTTAATTGAAA 

35 GTGATGATGTTGGGTGTGTCAAGATGCATGATCTGGTCCGTGCTTTTGTT 
TTGGGTATGTTTTCTGAAGTCGAGCATGCTTCTATTGTCAACCATGGTAA 
TATGCCTGGGTGGCCTGATGAAAATGATATGATCGTGCACTCTTGCAAAA 
GAATTTCATTAACATGCAAGGGTATGATTGAGATTCCAGTAGACCTCAAG 
TTTCCTAAACTAACGATTTTGAAACTTATGCATGGAGATAAGTCGCTAAG 

40 GTTTCCTCAAGACTTTTATGAAGGAATGGAAAAGCTCCATGTTATATCAT 
ACGATAAAATGAAGTACCCATTGCTTCCTTTGGCACCTCGATGCTCCACC 
AACATTCGGGTGCTTCATCTCACTGAATGTTCATTAAAGATGTTTGATTG 
CTCTTCTATCGGAAATCTATCGAATCTGGAAGTGCTGAGCTTTGCAAATT 
CTCACATTGAATGGTTACCTTCCACAGTCAGAAATTTAAAGAAGCTAAGG
TTACTTGATCTGAGATTTTGTGATGGTCTCCGTATAGAACAGGGTGTCTT 
GAAAAGTTTTGTCAAACTTGAAGAATTTTATATTGGAGATGCATCTGGGT 
TTATAGATGATAACTGCAATGAGATGGCAGAGCGTTCTTACAACCTTTCT 
5 GCATTAGAATTCGCGTTCTTTAATAACAAGGCTGAAGTGAAAAATATGTC 
ATTTGAGAATCTTGAACGATTCAAGATCTCAGTGGGATGCTCTTTTGATG
AAAATATCAATATGAGTAGCCACTCATACGAAAACATGTTGCAATTGGTG 
ACCAACAAAGGTGATGTATTAGACTCTAAACTTAATGGGTTATTTTTGAA
AACAGAGGTGCTTTTTTTAAGTGTGCATGGCATGAATGATCTTGAAGATG 

10 TTGAGGTGAAGTCGACACATCCTACTCAGTCCTCTTCATTCTGCAATTTA 
AAAGTTCTTATTATTTCAAAGTGTGTAGAGTTGAGATACCTTTTCAAACT 
CAATCTTGCAAACACTTTGTCAAGACTTGAGCATCTAGAAGTTTGTGAAT 
GTGAGAATATGGAAGAACTCATACATACTGGAATTGGGGGTTGTGGAGAA 
GAGACAATTACTTTCCCTAAGCTGAAGTTTTTATCTTTGAGTCAACTACC 

15 GAAGTTATCAAGTTTGTGCCATAATGTCAACATAATTGGGCTACCACATC 
TCGTAGACTTGATACTTAAGGGCATTCCAGGTTTCACAGTCATTTATCCG 
CAGAACAAGTTGCGAACATCTAGTTTGTTGAAGGAAGGGGTAGATATATG
TTCTTTATGTTAATACAATTTAAATAATATTTTCAACCAAATTTTCATAA 
TATATCTGTAATTTGATTGTATGATGTGTTATTGTTTATATGTGGCTATT 

20 AAGGGATGATTATTTTGCAGGTTGTGATTCCTAAGTTGGAGACACTTCAA 
ATTGATGACATGGAGAACTTAGAAGAAATATGGCCTTGTGAACTTAGTGG 
AGGTGAGAAAGTTAAGTTGAGAGCGATTAAAGTGAGTAGCTGTGATAAGC 
TTGTGAATCTATTTCCGCGCAATCCCATGTCTCTGTTGCATCATCTTGAA 
GAGCTTACAGTCGAGAATTGCGGTTCCATTGAGTCGTTATTCAACATTGA 

25 CTTGGATTGTGTCGGTGCAATTGGAGAAGAAGACAACAAGAGCCTCTTAA 
GAAGCATCAACGTGGAGAATTTAGGGAAGCTAAGAGAGGTGTGGAGGATA 
AAAGGTGCAGATAACTCTCATCTCATCAACGGTTTTCAAGCTGTTGAAAG 
CATAAAGATTGAAAAATGTAAGAGGTTTAGAAATATATTCACACCTATCA
CCGCCAATTTTTATCTGGTGGCACTTTTGGAGATTCAGATAGAAGGTTGC 

30 GGAGGAAATCACGAATCAGAAGAGCAGGTAACGCTTTCAATTTCACTTTC 
TTA-\TTAATTAAGGACTAAGCTCCTGTTTTTTGAATAATAAAGAGGTGGG 
ATGACTAAACTTGGGCATCACAATTGCAACAAAATGTTACAAACCATGAA 
ACGTTCAAACCATTTCTTGAATTAAGGTTTCAATACAAGTCATTTAAAAA 
TATGGCTTAAATTTTTTTTATATTTATGTATCAACATGATTTTTCATTAG 

35 AGATCATTATTATAATAGTAAGTTTAAAGCAATTTAAATCAGAACTAATT 
CTAACTTTAGCTAATAAATCGTTATAAATGTAAATAATTACTTTTTAGTG 
AAATAAGCAACGGATTTAATAAGTTAACAACTTAAATGTCATTTCCTAAC 
AAAAAAAACTATTTGGTTCAGAAAAACCGTAATTCAAGATAACTAAAATA 
AAAATATTTGACATTCACTAAGAGCATTTTTTTTTCTAAATATGATTGCA 

40 AATGAATAAAACTTAAATTTATACAGAAAATTCTTTTATATATGTTATAC 
AAAATTTACAAATTGAAATTGGATATGTTAATTAACGGTTTATAATTCTG 
GTATCACAAAGGGATATATAATAAAATATTATTTTCTGTAGTCATTTGTA 
ATTGTACTAGTTTATAACCCGTGGGAACCATGAGTTCTAAAATTAGTTAA 
ACTTTCATAATAAAAATTTATAATTATTATTTATTTTAAATAAATTATTA
ATTAAGAGATATATCAAAAATTTAAAGTTATTATAACTTCAAATTTAACA 
TATAATTAGAAAATATATGATCATAACTTTCTGCAACTCTTCTTTTGTAT 
TAAAATGACCAGAGAAGCTCTTAGTATATTTCTAATCAAAGTCTCAAAMC 
5 TAATGAAGCATATAATTTGTGAAAATCAATTAGCATTAGGTTTTAAGAGT 
CACCAAATTCAAAGAATAATCCAATGCTTTCATTACCACTATGGAGAAAA 
TATTTTCTTAGTTTAAATGAAATGAAAACAAACATTCAAACTAATTGTTG 
CTTATTAAACCAAAGACCCATTACTTAGCCAAGAGTTTAACAAAAAAAAA 
TTACATTCATGTATCATTATTCATGACTAGATATATATGAACATGAAGGG 
10 AGTTTTTATAGAAAATATAATCATAGATATTCAACATAACTTCAGGGAAT 
TCCTCAAAATAACCAAGTTATTCAAGAAATTACATCCAAGTCAACCAAAG 
AGAAGTTTAGCCTAGCATGGCTAAACTCAAGAAACTAAAATAAGGATTAG 
AAGTACCAAACATGTAGTAAGAATCACAGTAAAAGATGATGTTGTTCTTG 
ATGTTCTTCTAAGTTCTTCAAGTCTCCAGTTGCTCCTAATAATGCAAAGG 
15 AGAGCCATTAAATTCGTATGTATTGATCCCTTCAAAAGCTGCACCAACCT 
CCCTTAAATAACACTCAAAGCAAAAATGACAAAATTGCCCCTGAAGGACC 
CTATGTGGGTGCCTTGCGCGGGTGGAGCTGCATACGAAAGGTCTTTGGTC 
TTTGTGAGGGTGATGTTGTGCGGGATAGCTTGTCGCATGCTTCCGCGCGG 
TTCACGCACATGTGCACAGGTGATGCATGGTGTGTGCGTTCTTGAGTTTT 
20 GAGCCTCCGATGCTTAGTCCACTTGGCCCAATTCGAGTCCAATCAGCTTA 
TAACCCATTTTTCTTCAAGTTATCTTCAAGTTAAGCCCAATTTGCCTTCT 
CCAAATCATCCATAACTTCACAGAATCGCCCGTTCATCTTAATCCCGGAT 
GCACAATTATTCTCCCGTCTTCATTTTAAGCAAGATACCACCTTCTTCAT 
GCTTCATCCATCAATAGTACACTTCATGTATCATCTCTACTAGTTATTTA 
25 GTCCACAATCCTTGTTGTCCTCCAAATTTAATTATCTCATTTAGTTCCCG 
TTCCGCTAGTTTCCTTAAAATTTGCAATTAAGCTCAGAGAAATATTAAGT 
ACCCGAAATGGTCATAAAATAACAAAAAGGAAAATATGCATGAAGATTAA 
CTA.\ATGATGAACGAAATATGCTAAAATAGACTATAAAATGAAGTAAATA 
AAATGAAATTATCGCACTCCGACCACCCTTATAGGCTTGTAGTCCACCCA 
30 CCCTTCATTCCTTGTACCAATATGGGATGGAAACATCATTAATTAAGCCA 
AAAAGCTAACATATAAGGGTTTAGTGACAAAGGTAAGTACTAAAGATGAA 
AAT.\ATCCATTTTTCTTGTATATACACAACACACACATAGGGGCAGACGT 
AGGATTTCAAAGTACAGATTGTTGGTGGCACATAAGTGTTGCTGGTGACA 
TTTTTTTTTTTTTTTTACGTAGTGGCACAACAGTAGGAAAAACGAAAAAT 
35 TCGAAATTTTTTACAATTTGTCTAAAAAAAACAGTGGTTGTTGGTGCCAC 
TATGGACACCAAAGTTGAACTGCCCCCACGCGCGCACACACACACACACA 
CATAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAAAGAAAGAAAGAGAGA 
GAGAGTTTGGGATGTGATACTTCTTTTAGGAAAATGGAGTTATATCTTTG 
ATATTGTATTTTTTTAATGTAATTTATATATTTAATCATTTTAGTTTATA 
40 AGTTTTATTTATTTTGATATGAAAAAAAAAGTCTTTTATACATTGGATTT 

AACATAAAAATCCAACAATATTAATCAAAAAGACCAMACATGTGGACAMW 

TATGTATATAAWTAATTCACAATAGTCTTTAGGAATAGNATTATATATAT 

AATTAATTCTCAATGGTCTTAGGAATAGTAAGTTCTTATATTTCAAACTT 
TNGCCACAATTCTTTGKTTACTTWGACACTTYCCTCTCTCTAATTATATA
TATATATATATATATATATATATATATACACACACACACACACACACTAG 
ATGTGTGCCCGCGCAAAGCAGTGACGTNNNGGAGAANACTTTCTTAAGCA 
TAAATAATTATTATATTTTTTATTGGGTATTATATAATAAAAAATTACAA 

5 CTTTTAAATAAAATATTTATGTTTATACTTTATATTTATATTGCTTGTAT 
ACTATTAATATAATAAATTAATATTTATGTCTAATTTATGAAATGTAAAT 
TAATTTAAATACATGAATTTAATATTTTTAAAATTTTCAGTTTGCTTCAA 
ATTGAGTTTCTTAATTATTTTTTTTAATTCANGTATTCAAACTTTTGGTA 
AGTATTAAAGAATTATTTATGCATAATTGATTTATACAAAAAACTTTGTA 

10 ACTTATACATCTTAAAATTCAAGATATAACTAACATGTTTTACAATATAT 
ATATATATATATATATATATATATATATATATATATATATATATATATAT 
TAAAGCGCAAAGGTCATAGGAATAGAATATTTTCTATTATTCTACGTTTT 
GCCACAAAAGTTTGAACACTTTGCCACTTTTTGTCCCTCCTTAACCTTTT 
CAATGTTTTGCGACAAAAGTTCCAAAACTTTGCCACTTTGATCATTCCTC 

15 AACTTTTCACCGCAATTAGTTTGTGGAGTTGGCAGTTTTGATCCCCCTAA 
CTTCGATATTCTCTACTGCTAGCCAAAAAGGGTTCCAGAGTTTCACACTT 
TTGGTCCCTGACAGTAACCAAATGTGAGATGTCAAATTTTTGCCACATTA 
GTTTGTGGAGTTGTCCCTTTTGGTCCCCCCACATTCGATATTCTANTATA 
CGACCTTATTTTTNTCAAATAACAACACGTATATTTAATTACCAATTATA 

20 GAAATAGATATCAAATAAAGTATTTGTAACACTGTGTAAGAACGGTGCTA 
CTATAGGTAAAAATAAACATTTCAAAGTACGATATCCTAATTGGAAAAAG 
AGTTTTAAAAAAATAACGACTAGGGGCGAGTTTTTTTTACAAGTTTGTAT 
CAA.ATCATATCAAAATTTAAGGTGGAACGGTGACCACATTAACCAGAAAT 
GTA.\TTTATTCTTTGATTTTGATAATTTTTAATATTTTGTTGTGATCTAT 

25 GTATTTAAAAGTAAACAACAAAGAACATAATCCAAAACCCTAAATTGCAA 
GTCTCGCCCAATTTCTCTATCACTAGTCCTCACTTACGATGGCGTTACGT 
CGCTCTCTCACTGCTTACAACCCTTTGTTGCTACTCATTACAATAACGAA 
AAGTTGAATATCCATATATTTATTTGGATGTGGAATTGAACGAATCTCGT 
CAA.AATTTTGATTTTGTTGA'TGGATTTGAGTAGAAGTTTGGGCAGAACGG 

30 GAATGATGGTCTGCAAGTGGTTATAAACTTGATTCTGAGTTATTACTATA 
TATGTAGCCTCTTTACAACGACCAAGGTTTCTTCCAGGTACCATTTGATC 
TTTTTAGAACTTAGTTTTCTGAAACACCCTGATTTGGATCAAATATCACC 
AAC-AACTCTTAAAAACTTGATTAATCAATTGTTTTCTTCATCTTGATAAC 
AAGTGGAATGATTTTCTACTTAGATTAACTTGAAAAAAAAGGTCCATGTG 

35 CGTCTGGTGGATCTGGTAAATGAAGATGGAAGGGAGAGCTGACTTTAAAG 
ACACAAACACGTCACCATATCTCTTATTTTATTTTAAATTTGCTTTTGGT 
GTATTTTCTTTTTTCCTATTTCTTTCTTTCTTGATCTCCAGATGGTATGT 
GGTGTGGATAATTTACACCTAGAGATTGGGAACGATGGGAAGGGGTCTGT 
GATTTATGGCTGGCCGAGTTTTACTTATTAACTCAATTTCAACCTAAATT 

40 CTGATTCTTGTTTGAAAATAAGTTGCATCTTTATTTTTGTATTATCTTGT 
TGCATAGGATCCTTAGCATCTTTTAATAATTTATTTGAAGGTGAAAGATC 
CAACTATTTTTTAGCTGTTGGCATTTTCCATCATTTGCAACTGTTTCTTG 
AAAAAAAAATACCTAAAATAAAAATAACCATTTTCAAATCCAAAATTATA 
AGAGAGAATTGTAAATGGACATGGAATCATAAATCATTAACACAGTTCAG

TAAACAAGTTGCTAATTACATTTCTTGCTGTGCAGATTGAAATTCTATCA 

GAGAAAGAGACATTACAAGAAGCCACTGGCAGTATTTCAAATCTTGTATT 

CCCATCCTGTCTCATGCACTCTTTTCATAACCTCCGTGTGCTTACATTGG 

ATAATTATGAAGGAGTGGAGGTGGTATTTGAGATAGAGAGTGAGAGTCCA 

ACATGTAGAGAATTGGTAACAACTCGCAATAACCAACAACAGCCTATTAT 

ACTTCCCTACCTCCAGGATTTGTATCTAAGGAATATGGACAACACGAGTC 

ATGTGTGGAAGTGCAGCAACTGGAATAAATTCTTCACTCTTCCAAAACAA 

CAATCAGAATCCCCATTCCACAACCTCACAACCATAAATATTCTTAAATG 

CAAAAGCATTAAGTACTTGTTTTCGCCTCTCATGGCAGAACTTCTTTCCA 

ACCTAAAGGATATCCGGATAAGTGAGTGTGATGGTATTAAAGAAGTTGTT 

TCAAACAGAGATGATGAGGATGAAGAAATGACTACATTTACATCTACCCA

CACAACCACCACTTTGTTCCCTAGTCTTGATTCTCTCACTCTAAGTTTCC 

TGGAGAATCTGAAGTGTATTGGTGGAGGTGGTGCCAAGGATGAGGGGAGC 

AATGAAATATCTTTCAATAATACCACTGCAACTACTGCTGTTCTTGATCA 

ATTTGAGGTATGCTTTGTACATATTCAATTATTTATTTAATTTCCTTTTT 

TATTTGCAATATTCTATAAATAATACATTTTATACCCACTATACTAAGAT 

AAT.AATTACCTAGAGGGATGGATGCTATGACACAGCTGCTACACTTCAGA 

AACTCTAGTAAGGGCAGTTATGGAAGTTCAATAAAATGATAATGGCATCT 

TTTGATGGGTAATATAGGCAATTTAAGTTTTATTTCTGTTAAAGCAGTAT 

TTAGCAAGTACTGGCCAGTAGGAGAGGAGAATATCACCTTTTGTGAAAAT 

CTGGTCATTGTACCCAGAATTTAGTTAAATGTAACATTTTAGATATTAGG 

GGACATCAGGTGACAGATATTGTAGAATAGAACAATATATAATATTACCC 

AAAACTATTTTTTCTAAGGTTATTCTGTTAAATATGTGCTTTCTTGATTT 

CATTGAATTTGCATTCCTATATTTTAGGTGGTAAAGTGATTGTCTCTTCA 

ATA.\ATCCCGAAATTAATTAAAAAAGAAAAAAACAAAAGTAAATTTTTGA 

TATGGAGAGCACTGGTATCATTTAGTATATAAAAAAACTAGATTTTGAAT 

TAAGTTTCTTATATAAAAGCTGTGTATATAGTTTAATTAGTTTTACATCA 

TTTTTCCATGTGGTGTTGCAGTTGTCTGAAGCAGGTGGTGTTTCTTGGAG 

TTTATGCCAATACGCTAGAGAGATARAAATAGKTGGATGCTATGCATTGT 

CAAGTGTGATTCCATGTTATGCAGCAGGACAAATGCAAAAGCTTCAAGTG 

CTGAGAATAGAGTCTTGTGATGGCATGAAGGAGGTATTTGAAACTCAATT 

AGGGACGAGCAGCAACAAAAACAACGAGAAGAGTGGTTGCGAGGAAGGAA 

TTCCAAGAGTAAATAACAATGTTATTATGCTTCCCAATCTAAAGATATTA

AGTATTGGAAATTGTGGGGGTTTGGAACATATATTCACATTCTCTGCACT 

TGA.^AGCCTGAGACAGCTCCAAGAGTTAAAGATAAAATTTTGCTACGGAA 

TGA_AAGTGATTGTGAAGAAGGAAGAAGATGAATATGGAGAGCAGCAAACA 

ACAACAACAACAACGAAGGGGGCATCTTCTTCTTCTTCTTCTTCTTCTTC 

TTCTTCTTCTAAGAAGGTTGTGGTCTTTCCTTGTCTAAAGTCCATTGTAT 

TGGTCAATCTACCAGAGCTGGTAGGATTCTTCTTGGGGATGAATGAGTTC 

CGGTTGCCTTCATTAGATAAACTTAAGATCAAGAAATGCCCAAAAATGAT 

GGTGTTTACAGCTGGTGGGTCCACAGCTCCCCAACTCAAGTATATACACA 

CAAGATTAGGCAAACATACTCTTGATCAAGAATCTGGCCTTAACTTTCAT 
CAGGTATATATATATTTCTTTAATTGGCATCATCTAATTAAGAAAGATAT
CATTCCTGCCAAGTAAATTTACTTCAAACACATTCACACTGGTTTCAGTC 
TAAGTTTATGTTGTTCTAAGAAGGCCAAAATGGGAAAGCAAGATAGGGAA 
AAATAGTGTATTTCAGTGGAAAGGGTATTTTAGGCATTTTCTGTCAAAAG 
5 TTGTTATTGCAGGCTTTTTAGTACCTGGAATCGTGTGTGGGAGGAGCATT 
ATTATTCTGATTTGCTTGTTTCTTTATCATTTTTTCTTAGCCTCTCGAAC 
AGCTAGAAACCCTTTTAATCTTTTGATTTTCAATGACGAAATTTTTCCCT 
GTTACTCCATTTGATTGTTGTTCTTCATGGTTCTAAGTGAGTTATTGGCT 
CATCTGTTACTTCTTTTGATTGTTATTTTCATATCATGTTGTCCTTTGAA

10 TCAAGCTTTTCCATTTTCAACCAGGGCAAAAGGTCAAAAGTAACCTACTT 
TATGAGATCAAAAACAGCAACCCATCGGATAACTTTTAGTTGGAGTTAAT 
AGTTACAATTACCATTGTGATTAATAATTATAATATCTTGTATTAATTCA 
TAAAAATTGGTACAGCACATATATGACATTTCAAAGGTTTTTGTTTGACA 
TATATATGCCTCTGGCGTTTTCTTTATTGGACTTGCAGACCTCATTCCAA 

15 AGTTTATACGGTGACACCTTGGGCCCTGCTACTTCAGAAGGGACAACTTG 
GTCTTTTCATAACTTTATCGAATTAGATGTGGAAGGTAATCATGATGTTA 
AAA.\GATTATTCCATCCAGTGAGTTGCTGCAACTGCAAAAGCTGGAAAAG 
ATT.AATGTAAGGTGGTGTAAAAGGGTAGAGGAGGTATTTGAAACTGCATT 
GGA.A.GCAGCAGGGAGAAATGGAAATAGTGGAATTGGTTTTGATGAATCGT 

20 CAC.\AACAACTACCACTACTCTTGTCAATCTTCCAAACCTTAGAGAAATG 
AACTTATGGGGTCTAGATTGTCTGAGGTATATATGGAAGAGCAATCAGTG 
GACAGCATTTGAGTTTCCAAACCTAACAAGAGTTGATATCTATAAATGTA 
AAAGGTTAGAACATGTATTTACTAGTTCCATGGTTGGTAGTCTATCGCAA 
CTCCAAGAGCTACATATATCCAACTGCAGTGAGATGGAGGAGGTGATTGT 

25 TAAGGATGCAGATGATTCTGTAGAAGAAGACAAAGAGAAAGAATCTGATG 
GGGAGACGAATAAGGAGATACTTGTGTTACCTCGTCTAAACTCCTTGATA 
TTA.AGAGAACTTCCATGTCTTAAGGGGTTTAGCTTGGGGAAGGAGGATTT 
TTCATTCCCATTATTGGATACTTTAAGAATTGAGGAATGCCCAGCAATAA 
CCACCTTCACCAAGGGAAATTCCGCTACTCCACAGCTAAAAGAAATTGAA 

30 ACACATTTTGGCTCGTTTTGTGCTGCAGGGGAAAAAGACATCAACTCTCT 
TAT.^AAGATCAAACAACAGGTAAATCAGATCTTTGTTGCTTTAATAATTC 




AAAACCGCAACCTACATTTTCAGCTTTATATTTATGTACTTTATGCAGGA 
GTTCAAACAAGACTCTGATTAATGTGAAGTAAATACTAAAGGTAAATTAT 
35 ATTTTCATGTTCCTAGTTGCCTATTAATTAATTGCCTTTTAGTTCATGAT 
TTTTGGATGCATTCTTCATGATGATGTCAATCTTCTAATACCCCATTCAT 
TGTTTGGTTGAATGTTGACTCTATGTCTTGATGAATATTCAAGGGAAGAA 
TTGTTCATCATATGAAGGACATTAAAGAAGAACATGGATGCTATGAAGAT 
GTGGGAAAACAA 

RG2B deduced polypeptide sequence (SEQ ID NO:90)

MSDPTGIAGAIINPIAQTALVPVTDHVGYMISCRKYVRVMQMKMTELNTSRISVEE

HISRNTRNHLQIPSQTKEWLDQVEGIRANVENFPIDVITCCSLRIRHKLGQKAFKITE

QIESLTRQLSLISWTDDPVPLGRVGSMNASTSASLSDDFPSREKTFTQALKALEPNQK

FHMVALCGMGGVGKTRMMQRLKKAAEEKKLFNYIVGAVIGEKTDPFAIQEAIADY

LGIQLNEKTKPARADKLREWFKKNSDGGKTKFLIVLDDVWQLVDLEDIGLSPFPNQ

GVDFKVLLTSRDSQVCTMMGVEANSIINVGLLTEAEAQSLFQQFVETSEPELQKIGE

DIVRKCCGLPIAIKTMACTLRNKRKDAWKDALSRIEHYDIHNVAPKVFETSYHNLQ

EEETKSTFLMCGLFPEDFDIPTEELMRYGWGLKLFDRVYTIREARTRLNTCIERLVQ

TNLLIESDDVGCVKMHDLVRAFVLGMFSEVEHASIVNHGNMPGWPDENDMIVHSC

KRISLTCKGMIEIPVDLKFPKLTILKLMHGDKSLRFPQDFYEGMEKLHVISYDKMKY

PLLPLAPRCSTNIRVLHLTECSLKMFDCSSIGNLSNLEVLSFANSHIEWLPSTVRNLK

KLRLLDLRFCDGLRIEQGVLKSFVKLEEFYIGDASGFIDDNCNEMAERSYNLSALEF

AFFNNKAEVKNMSFENLERFKISVGCSFDENINMSSHSYENMLQLVTNKGDVLDSK

LNGLFLKTEVLFLSVHGMNDLEDVEVKSTHPTQSSSFCNLKVLIISKCVELRYLFKL

NLANTLSRLEHLEVCECENMEELIHTGIGGCGEETITFPKLKFLSLSQLPKLSSLCHN

VNIIGLPHLVDLILKGIPGFTVIYPQNKLRTSSLLKEGVVIPKLETLQIDDMENLEEIW

PCELSGGEKVKLRAIKVSSCDKLVNLFPRNPMSLLHHLEELTVENCGSIESLFNIDLD

CVGAIGEEDNKSLLRSINVENLGKLREVWRIKGADNSHLINGFQAVESIKIEKCKRFR

NIFTPITANFYLVALLEIQIEGCGGNHESEEQIEILSEKETLQEATGSISNLVFPSCLMH

SFHNLRVLTLDNYEGVEVVFEIESESPTCRELVTTRNNQQQPIILPYLQDLYLRNMD

NTSHVWKCSNWNKFFTLPKQQSESPFHNLTTINILKCKSIKYLFSPLMAELLSNLKDI

RISECDGIKEVVSNRDDEDEEMTTFTSTHTTTTLFPSLDSLTLSFLENLKCIGGGGAK

DEGSNEISFNNTTATTAVLDQFELSEAGGVSWSLCQYAREIEIVGCYALSSVIPCYAA 

GQMQKL 

RG2C polynucleotide sequence (SEQ ID NO:91) 

ATA,A.TATTACACAAAGGTAACGTCATTAATTAATTACGATACGAGACAGA 

CTTTTTCACTCGGACATNAACGGTCTATTCCTAACTTNANNTAATTNAAT 

GAATTTAGGATGTGCTAATATGCATGTAANATTCGCTACCGTCATCTTTC 

AAATGACCATATTTTTATGTATTTATAATGAATCAATGAAAAACCGGATT 

TCTATTTAAAATTCTTAAAACTTCATCTTTTAAGCCAGGGTGAATACAAT 

TGCTAGATCCACTGTTAATTTCCATCGAATTATGCCTGATCAATTGTTGG 

CTGCCTACGATGCAGGTGCTACCACAAGAATATGGCCATGGAAACTGCTA 

ATGAAATTATAAAACAAGTTGTTCCAGTTCTCATGGTTCCTATTAACGAT 

TACCTACGCTACCTCGTTTCCTGCAGAAAGTACATCAGTGACATGGATTT 

GAAAATGAAGGAATTAAAAGAAGCAAAAGACAATGTTGAAGAGCACAAGA 

ATCATAACATTAGTAATCGTCTTGAGGTTCCAGCAGCTCAAGTCCAGAGC 

TGGTTGGAAGATGTAGAAAAGATCAATGCAAAAGTGGAAACTGTTCCTAA 

AGATGTCGGCTGTTGCTTCAATCTAAAGATTAGGTACAGGGCCGGAAGGG 

ATGCCTTCAATATAATTGAGGAGATCGACAGTGTCATGAGACGACACTCT 

CTGATCACTTGGACCGATCATCCCATTCCTTTGGGAAGAGTTGATTCCGT 
GATGGCATCCACCTCTACGCTTTCAACTGAACACAATGACTTCCAGTCAA
GAGAGGTAAGGTTTAGTGAAGCACTCAAAGCACTTGAGGCCAACCACATG 
ATAGCCTTATGTGGAATGGGGGGAGTGGGGAAGACCCACATGATGCAAAG 
GCTGAAGAAGGTTGCCAAAGAAAAGAGGAAGTTTGGTTATATCATCGAGG 
5 CGGTTATAGGGGAAATATCGGACCCCATTGCTATTCAGCAAGTTGTAGCA 
GATTACCTATGCATAGAACTGAAAGAAAGCGATAAGAAAACAAGAGCTGA 
GAAGCTTCGTCAAGGGTTCAAGGCCAAATCAGATGGAGGTAACACTAAGT 
TCCTCATAATATTGGATGATGTCTGGCAGTCCGTTGATCTAGAAGATATT 
GGTTTAAGCCCTTCTCCCAATCAAGGTGTCGACTTCAAGGTCTTGTTGAC 

TTCACGAGACGAACATGTTTGCTCAGTGATGGGGGTTGAAGCTAATTCAA
TTATTAACGTGGGACTTCTAATTGAAGCAGAAGCACAAAGATTGTTCCAG 
CAATTTGTAGAAACTTCTGAGCCCGAGCTCCACAAGATAGGAGAAGATAT 
TGTTAGGAGGTGTTGCGGTCTACCCATTGCCATCAAAACCATGGCGTGTA 
CTCTAAGAAATAAAAGAAAGGATGCATGGAAGGATGCACTTTCTCGTTTA 

CAACACCATGACATTGGTAATGTTGCTACTGCAGTTTTTAGAACCAGCTA
TGAGAATCTCCCGGACAAGGAGACAAAATCTGTTTTTTTGATGTGTGGTT 
TGTTTCCCGAAGACTTCAATATTCCTACCGAGGAGTTGATGAGGTATGGA 
TGGGGCTTAAAGTTATTTGATAGAGTTTATACAATTATAGAAGCAAGAAA 
CAGGCTCAACACCTGCATTGACCGACTGGTGCAGACAAATTTACTAATTG 

GAAGTGATAATGGTGTACATGTCAAGATGCATGATCTGGTCCGTGCTTTT
GTTTTGGGTATGTATTCTGAAGTCGAGCAAGCTTCAATTGTCAACCATGG 
TAATATGCCTGGGTGGCCTGATGAAAATGATATGATCGTGCACTCTTGCA 
AAAGAATTTCATTAACATGCAAGGGTATGATTGAGTTTCCAGTAGACCTC 
AAGTTTCCTAAACTAACGATTTTGAAACTTATGCATGGAGATAAATCGCT 

AAAGTTTCCTCAAGAATTTTATGAAGGAATGGAAAAGCTCCGGGTTATAT
CATACCATAAAATGAAGTACCCATTGCTTCCTTTGGCACCTCAATGCTCC 
ACCAACATTCGGGTGCTTCATCTCACGGAATGTTCATTAAAGATGTTTGA 
TTGCTCGTGTATTGGAAATCTATCGAATCTGGAAGTGCTGAGCTTTGCTA 
ATTCTTGCATTGAGTGGTTACCTTCCACGGTCAGAAATTTAAAAAAGCTA 

AGGTTACTTGATTTGAGATTGTGTTATGGTCTCCGTATAGAACAGGGTGT
CTTGAAAAGTTTGGTCAAACTTGAAGAATTTTATATTGGAAATGCATATG 
GGTTTATAGATGATAACTGCAAGGAGATGGCAGAGCGTTCTTACAACCTT 
TCTGCATTAGAATTCGCGTTCTTTAATAACAAGGCTGAAGTGAAAAATAT 
GTCATTTGAGAATCTTGAACGATTTAAGATCTCAGTGGGATGCTCTTTTG 

ATGGAAATATCAATATGAGTAGCCACTCATACGAAAACATGTTGCGATTG
GTGACCAACAAAGGTGATGTATTAGACTCTAAACTTAATGGGTTATTTTT 
GAAAACAGAGGTGCTTTTTTTAAGTGTGCATGGCATGAATGATCTTGAAG 
ATGTTGAGGTGAAGTCGACACATCCTACTCAGTCCTCTTCATTCTGCAAT 
TTAAAAGTCCTTATTATTTCAAAGTGTGTAGAGTTGAGATACCTTTTCAA 

ACTCAATGTTGCAAACACTTTGTCAAGACTTGAGCATCTAGAAGTTTGTA
AATGCAAGAATATGGAAGAACTCATACATACTGGGATTGGGGGTTGTGGA 
GAAGAGACAATTACTTTCCCCAAGCTGAAGTTTTTATCTTTGAGTCAACT 
ACCGAAGTTATCAGGTTTGTGCCATAATGTCAACATAATTGGGCTACCAC 
ATCTCGTAGACTTGAAACTTAAGGGCATTCCAGGTTTCACAGTCATTTAT
CCGCAGAACAAGTTGCGAACATCTAGTTTGTTGAAGGAAGAGGTAGATAT 
ATGTTCTTTATGTTAATACAATTTAAACAATATTTTCAACCAAATTTTCA 
TAATATATCTGTAATTTGATTGTATGATGTGTTATTGTTTATATGTGGCT 

ATTAAGGGATGATAATTTTGCAGGTTGTGATTCCTAAGTTGGAGACACTT
CAAATTGATGACATGGAGAACTTAGAAGAAATATGGCCTTGTGAACTTAG 
TGGAGGTGAGAAAGTTAAGTTGAGAGAGATTAAAGTGAGTAGCTGTGATA 
AGCTTGTGAATCTATTTCCGCGCAATCCCATGTCTCTGTTGCATCATCTT 
GAAGAGCTTACAGTCGAGAATTGCGGTTCCATTGAGTCGTTATTCAACAT

TGACTTGGATTGTGTCGGTGCAATTGGAGAAGAAGACAACAAGAGCCTCT
TAAGAAGCATCAACGTGGAGAATTTAGGGAAGCTAAGAGAGGTGTGGAGG 
ATAAAAGGTGCAGATAACTCTCATCTCATCAATGGTTTTCAAGCTGTTGA 
AAGCATAAAGATTGAAAAATGTAAGAGGTTTAGAAATATATTCACACCTA 
TCACCGCCAATTTTTATCTGGTGGCACTTTTGGAGATTCAGATAGAAGGT

TGCGGAGGAAATCACGAATCAGAAGAGCAGGTAACGCTTTCAATTTCACT
TTCTTAATTAATTANGGACTAAGCTCCTGTTTTTTGAATAATAAAGAGGT 
GGGATGACTAAACTTGGGCATCACAATTGCAACAAAATGTTACAAACCAT 
GAAACGCTCAAACCATTTCTTGAATTAAGGTTTCAATACAAGTCATTTAA 
AAATATGGCTTAAATTTTTTTATATTTATGTATCAACATGATTTTTCATT 

AGAGATCATTATTATAATAGTAAGTTTAAAGCAATTTAAATTAGAACTAA
TTCTAACTTTAGCTAATAAATCGTTATAAATGTAAATAATTACTTTTTAG 
TGAAATAAGCAACGGATTTAATAAGTTAACAACTTAAATGTCATTTCCTA 
ACAAAAAAAACTATTTGGTTCAGAAAAACTGTAATTCAAGATAACTAAAA 
TAAAAATATTTGACATTCACTAAGAGCATTTTTTTCTAAATATGATTGCA 

AATGAATAAAACTTAAATTTATACAGAAAAGATTTTTATATATGTTATAC
AAAATTTACAAATTGAAATTGGATATGTTAATTAACGGTTTATAATTCTG
GTATCACAAAGGGATATATAATAAAATATTATTTTTCTGTAGTCATTTAT 
AATTGTACTAGTTTATAACCCGTGGGAACCATGAGTTCTAAAATTAGTTA 
AACTTTCATAATAAAAATTTATAATTATTATTTATTTTAAATAAATTATT 

AATTAAGAGATATATCAAAAATTTAAAGTTATTATAACTTCAAATTTAAC
ATATAATTAAAAAATATATGATCATAACTTTCCGCAACTCTTCTTTTGTA 
TTAAAATGACCAGAGAAGCTCTTAGTATATTTTCTAAATCAAAGTCACAA
AACTAATGAAGCATATAATTTTGTGAAAATCAATTAGCATTAGGTTTTAA 
GAGTCACCAAATTCAAAGAGTAATCCAATGCTTTCATTACCACTATGGAG 

AAAATATTTTCTTAGTTTAAATGAAATGAAAACAAACATTCAAACTAATT
GTTGCTTATTAAACCAAAGACCCATTACTTAGCCAAGAGTTTAACCAAAA 
AAAATTACATTCATGTATCATTATTAATGACTAGATATATATGAATATGA 
AGGGAGTTTTTATAGAAAATATAATCATAGATATTCAACATAACTTCATG 
GAATTCCTCAAAATAACCAAGTTATTCAAGAAATTACATCCAAGTCAACC 

AAAGAGAAGTTTAGCCTAGCATGGCTAAACTCAAGAAAATAAAATAAGGA
TTAGAAGTACCAAACATGTAGTAAGAATCACAGTAAAAGATGATGTTGTT 
CTTGATGTTCTTCTAAGTTCTTCAAGTCTCCAGTTGCTCCTAATAATGCA 
AAGGAGAGCCATTAAATTCGTATGTATTGATCCCTTCAAAAGCTGCACCA 
ACCTCCCTTAAATAACACTCAAAGCAAAAATGACAAAATTGCCCCTGAAG
GACGCTATGCGGGTGCCTTGCGCGGGTGGAGCTGAATATGAAAGGTCTTT 
GGTCTTTGTGAGGGTGATGTTGTGCGGGTTAGCTTGTCGCATGCTTCCGC 
GCGGTTCGCGCACATGTGCACAGGTGATGCATGGTGTGTACGTTCTTGAC 
TTTTGAGCCTCCGATGCTTAGTCCACTTGGCCCAATTCGAGTCCAATCAA
CTTATGACCCATTTTTCTTCAAGTTATCTTCAAGTTAAGCCCAATTTGCC 
TTCTCCAAATCATCCATAACTTCACAGAATCGCCCGTTCATCTTAATCCC 
GAATGAACAATTATTCTCCCGTCTTCATTTTAAGCAAGATACCACCTTCT 
TCATGCTTCATCCATCAATAGTACACTTCATGTATCATCTCTACTAGTTA 

TTTAGTCCACAGTCCTTGTTGTCCTCCAAATTTAATTATCTCATTTAGTT
CCCGTTCCGCTAGTTTCCTTAAAATTTGCAATTAAGCTCACAGAAATATT 
AAGTACCCGAAATGGTCATAAAATAACAGAAAGGAAAATATGCATGAAGA 
TTAACTAAATGATGAACGAAATATGCTAAAATAGACTATAAAATGAAGTA 
AATAAAATGAAATTATCGCACTCCGACCACCCTTATAGGCTTGTAGTCCA 

CCCACCCTTCATTCCTTGTACCAATATGGGATGGAAACATCATTAATTAA
GCCAAAAAACTAACATATAAGGGGTGAGTGACAAAGGTAAGTACTAAAGA
TGAAAAAAATCCATTTTTCTTGTATATACACAACACACACATAGGGGCAG 
ACGTAGGATTTCATAGTACAGATTGTTGGTGGCACATAAGTGTTGCTAGT 
GACATTTTTTTTTTCTTTTACGTAGTGGCACAACAGTARAAAAAACRAAA 

AATTCGAAATTTTTTACAATGTGCCTAAAAAAAACAGTGGTTGTTGGTGC
CACTATGGACACCAAAGTTGAACTGCCCCTGCGCGCGCACACACACACAC 
ACATAAAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGTTTG 
GGATGTGATACTTCTTTTGGGAAAATGGAGTTATATCTTTGATATTGTAT 
TTTTTTAATGTAATTTATATATTTAATCATTTTAGTTTATAAGTTTTATT 

TATTTKGATATGAAAAAAAAAGTCTTTTATACATTGGATTTAACATAAAA
ATCCAACAATATTAATCAAAAAGACCAAACATGTGGACAATTATGTATAT 
AATTAATTCACAATAGTCTTTAGGAATAGNATTATATATATAATTAATTC 
TCAATGGTCTTAGGAATAGTAAGTTCTTATATTTCAAACNTTTGCCACAN
TTCTTTGNTTACTTNGACACTTTYCTCTMWNNANWMWWTWATATATATAT 

ATATATATATATAHAHAHAHAVACACACACACTAGATGTGTGCCMGCGCA
AAGCAGTGACGTNNNGGAGAANACTTTCTTAAGCATAAATAATTATTATA 
TTTTTTATTGGGTATTATATAATAAAAAATTACAACTTTTAAATAAAATA 
TTTATGTTTATACTTTATATTTATATTGCTTGTATACTATTAATATAATA 
AATTAATATTTATGTCTAATTTATGAAATGTAAATTAATTTAAATACATG 

AATTTAATATTTTTAAAATTTTCAGTTTGCTTCAAATTGAGTTTCTTAAT
TATTGACCAAACATGTGGACAATTATGTATATAATTAATTCACAATAGTC 
TTTAGGAATAGTATTATATATATAATTAATTCTCAATGGTCTTAGGAATA 
GTAAGTTCTTATATTTCAAACTTTTGCCACAATTCTTTGCTTACTTTGAC 
ACTTTTCCTTCCTAACTTTACATATATATATATATTAAAGCGCAAAGGTC 

ATAGGAATATAATATTTTCTATTATTCTACGTTTTGCCACAAAAGTTTGA
ACACTTTGCCACTTTTTGTCCCTCCTTAACCTTTTCAATGTTTTGCGACA 
AAAGTTCCAAAACTTTGCCACTTTGATCATTCCTCAACTTTTCACCGCAT
TAGTTTGTGGAGTTGGCAGTTTTGGTCCCTCTAACTTCGATATTCTCTAC 
TGCTAGCCAAAAAGGGTTCCAGAGTTTCACACTTTTGGTCCCTGACAGTA

ACCAAATGTGAGATGTCAAATTTTTGCCACATTAGTTTGTGGAGTTGTCC 

CTTTTGGTCCCCCCACATTCGATATTCTACTATACGATCTTATTTTTCTC 

AAATAACAACACGTATATTTAATTACTAATGATAGAAATAGATATCAAAT 

AAAGTATTTGTAACACTGTGTAGAGTTTTTTTTTACAAGTTTGTATCAAA 

TCATATCAAAATTTAAGGTGGAACGGTGACCACATTAACCAGAAATGTAA 

TTTATTCTTTGATTTTGATAATTmAATATTTTGTTGTGATCTATGTAT 

TTAAAAGTAAACAACAAAGAACATAATCCAAAACCCTAAATTGCAAGTCT 

CGCCCAATTTCTCTATCACTAGTCCTCACTTACGATGGCGTTACGTCGCT 

CTCTCACTGCTTACAACCCTTTGTTGCTACTCATTACAATAACGAAAAGT 

TGAATATCCATATATTTATTTGGATGTGGAATTGAACGAATCTCGTCAAA 

TTTTTGATTTAGTTGATGGATTTGAGTAGAAGTTTGGGCAGAACGGGAAT 

GATGGTCTGCAAGTGGTTATAAACTTGATTCTGAGTTATTACTATATATG 

TAGCCTCTTTACAACGACCAAGGTTTCTTCCAGGTACCATTTGATCTTTT 

TAGAACTTAGTTTTCTGAAACACCCTGATTTGGATCAAATATCACCAACA

ACTCTTAAAAACTTGATTAATCAATTGTTTACTTCATCTTGATAACAAGT 

GGAATGATTTTCTACTTGAAAAAAAAGGTCCATGTGCGTCTGGTGGATCT 

GGTAAATGAAGATGGAAGGGAGAGCTGACTTTAAAGACACAAACACGTCA

CCATATCTTTTATTTTATTTTAAATTTTCTTTTTTCCTATTTCTTTCTTT 

CTTGATCTCCAGATGGTATGTGGTGTGGATAATTTACACATAGAGATTGG 

GAACGACTGTGATTTAGAGAGGACGTGGCTTGGGGTTGAGGATGGTTTAT 

GGCTGGCCGAGTTTCATTTATATAAACAAACAAATATATAAAACAAGGGG 




CTCTTATTCCCAACCAGTCAAATAGGGACTTAGGTTGTTTGGAAACAGTT 

CCGTGAGACCGTGACTTGGATGGTAGATAAATTTAGTAAACTTAACCCTT 

CAATTAACCTACCTTTTTCTTATTAACTCAATTTCAACCTAAATTCTGAT 

TCTTGTTTGAAAATAAGTTGCATCTTTATTTTTGTATTATCTTGTTGCAT 

AGGATCCTTAGCATCTTTTAATAATTTATTTGAAGGTGAAAGATCCAACT 

ATTTTTAATCTGTTGACGTTTTCCATCATTTGCAACTGTTTCTTGAAAAA 

AAAATACCTAAAATCAAAATAACCATTTTCAAATCCAAAATTATAAGAGA

GAATTGTAAATGGACATGGAATCATAAATCATTAACACAGTTCAGTAAAC 

AAGTTGCTAATTACATTTCTTGCTGTGCAGATTGAAATTCTATCAGAGAA 

AGAGACATTACAAGAAGCCACTGGCAGTATTTCAAATCTTGTATTCCCAT 

CCTGTCTCATGCACTCTTTTCATAACCTCCGTGTGCTTACATTGGATAAT 

TATGAAGGAGTGGAGGTGGTGTTTGAGATAGAGAGTGAGAGTCCAACAAG 

TAGAGAATTGGTAACAACTCACAATAACCAACAACAGCCTATTATACTTC 

CCTACCTCCAGGAATTGTATCTAAGGAATATGGACAACACGAGTCATGTG 

TGGAAGTGCAGCAACTGGAATAAATTCTTCACTCTTCCAAAACAACAATC

AGAATCACCATTCCACAACCTCACAACCATAGAAATGAGATGGTGTCATG 

GCTTTAGGTACTTGTTTTCGCCTCTCATGGCAGAACTTCTTTCCAACCTA 

AAGAAAGTCAAGATACTTGGGTGTGATGGTATTAAAGAAGTTGTTTCAAA 

CAGAGATGATGAGGATGAAGAAATGACTACATTTACATCTACCCACAAAA 
CCACCAACTTGTTCCCTCATCTTGATTCTCTCACTCTAAACCAACTGAAG
AATCTGAAGTGTATTGGTGGAGGTGGTGCCAAGGATGAGGGGAGCAATGA 
AATATCTTTCAATAATACCACTGCAACGACTGCTGTTCTTGATCAATTTG 
AGGTATGCTTTGTACATATTCAATTATTTATTTAATTTCCTTTTTTATTT 
GCAATATTCTATAAATAATACATTTTATACCCACTATACTAAGATAATAA
TTACCTAGAGGGATGGATGCTATGACACAGCTGCTACACTTCAGAAACTC 
TAGTAAGGGCAGTTATGGAAGTTCAATAAAATGATAATGGCATCTTTTGA 
TGGGTAATATAGGCAATTTAAGTTTTATTTCTGTTAAAGCAGTATTTAGC 
AAGTACTGGCCAGTAGGAGAGGAGAATATCACCTTTTGTGAAAATCTGGT 

CATTGTACCCAGAATTTAGTTAAATGTAACATTTTAGATATTAGGGGTTA
TCAGGTGACAGATATTGTAGAATAGAACAATATGTAATATTACCCAAAAC 
TATTTTTTCrAAGGTTGCTCTGTTAAATATGTGCTTTCITGAm 
AATTTGCATTCCTATATTTTAGGTGGTAAAGTGATTGTCTCTTCAATAAA 
TCCCGAAATTAATTAAAAAAAAAAAAACAAAAGTAAATTTTTGATATGGA 

GAGCACTGGTATCATTTAGTATATAAAAAAACTAGATTTTGAATTAAGTT
TCTTATATAAAAGCTGTGTATATAGTTTAATTAGTTTTACATCATTTTTC 
CATGTGGTGTTGCAGTTGTCTGAAGCAGGTGGTGTTTCTTGGAGCTTATG 
CCAATACGCTAGAGAGATAAAAATAGGCAACTGCCATGCATTGTCAAGTG
TGATTCCATGTTATGCAGCAGGACAAATGCAAAAGCTTCAAGTGCTGAGA 

GTAATGGCTTGCAATGGGATGAAGGAGGTATTTGAAACTCAATTAGGGAC
GAGCAGCAACAAAAACAACGAGAAGAGTGGTTGTGAGGAAGGAATTCCAA 
GAGTAAATAACAATGTTATTATGCTTCCCAATCTAAAGATATTAAGTATT 
GGAAATTGTGGGGGTTTGGAACATATATTCACATTCTCTGCACTTGAAAG
CCTGAGACAGCTCCAAGAGTTAACGATTAAGGGTTGCTACAGAATGAAAG 

TGATTGTGAAGAAGGAAGAAGATGAATATGGAGAGCAGCAAACAACAACA
ACAACAACGAAGGGGGCATCTTCTTCTTCTTCTTCTTCTAAGAAGGTGGT
GGTCTTTCCTTGTCTAAAGTCCATTGTATTGGTCAATCTACCAGAGCTGG 
TAGGATTCTTCTTGGGGATGAATGAGTTCCGGTTGCCTTCATTAGATAAA 
CTTATCATCGAGAAATGCCCAAAAATGATGGTGTTTACAGCTGGTGGGTC 

CACAGCTCCCCAACTCAAGTATATACACACAAGATTAGGCAAACATACTC
TTGATCAAGAATCTGGCCTTAACTTTCATCAGGTACATATATATTCCTTT 
AATTGGCATCATCTAATTAAGAAAGATATCATTCCTGCCAAGTAAATTTA 
CTTCAAACACATTCACACTAGTTTCAGTCCAAGTTTATGTTGTTCTAGGA 
AGGCCAAAATGGGAAAGCAAGATAGGGAAAAATAGAGTATTTCAGTGGAA 

AGGGTATTTTAGGTATTTTCTGTCAAAAATTGTTATTGCAGGCTTTTTAG
TACCTGGAAGAGCATGATTATTCTCGATTTGCTTGTTTCTTTATCATTTT 
TCTTAGCCTAGCATGATTTTCAATGAAATCTTTCCCTGTTACTCCATTTG 
ATTGTTGTTCTTCATGGTTCTAAGTGAGTTAGTGGCTCATCTGTTACTTC 
TTTTGATTGTTATTTTCATAGCATGTTGTCACTTGAATCAAGCTTTTCCA 

TTTTCAACAAGGACAAAAGGTCAAAACTAACCTACTTTATGAGATCAAAA
ATAGCAACCCATCGGATAACTTTTAGTTGGAGTTAATACTTACAATTACC 
ATTGTGATTAATAATTATAATATCTTGTATTAATTCATAAAAATTGGTAC 
AGCACATATATGACATTTCAAAGGTTTTTGTTTGACATATATATGCCTCT 
GGCGTTTTCTTTATTGGACATGCAGACTTCATTCCAAAGTTTATACGGTG
ACACCTTGGGCCCTGCTACTTCAGAAGGGACAACTTGGTCTTTTCATAAC 
TTTATCGAATTAGATGTGAAATCTAATCATGATGTTAAAAAGATTATTCC 
ATCCAGTGAGTTGCTGCAACTGCAAAAGCTGGTAAAGATTAATGTAATGT 

GGTGTAAAAGGGTAGAGGAGGTATTTGAAACTGCATTGGAAGCAGCAGGG
AGAAATGGAAATAGTGGAATTGGTTTTGATGAATCGTCACAAACAACTAC 
CACTACTCTTGTCAATCTTCCAAACCTTGGAGAAATGAAGTTACGGGGTC 
TCGATTGTCTGAGGTATATATGGAAGAGCAATCAGTGGACAGCATTTGAG 
TTTCCAAACCTAACAAGAGTTGAAATTTATGAATGTAATTCATTAGAACA 

TGTATTTACTAGTTCCATGGTTGGTAGTCTATTGCAACTCCAAGAGCTAG
AGATTGGTTTGTGCAACCATATGGAGGTCGTGCATGTTCAGGATGCAGAT 
GTTTCTGTAGAAGAAGACAAAGAGAAAGAATCTGATGGCAAGATGAATAA 
GGAGATACTTGTGTTACCTCATCTAAAGTCATTGAAATTACTACTTCTTC 
AAAGTCTTAAGGGGTTTAGCTTGGGGAAGGAGGATTTTTCATTCCCATTA 

TTGGATACTTTGGAAATCTACGAATGCCCAGCAATAACCACCTTCACCAA
GGGAAATTCCGCTACTCCACAGCTAAAAGAAATGGAAACAAATTTTGGCT 
TCTTTTATGCTGCAGGGGAAAAAGACATCAACTCCTCTATTATAAAGATC 
AAACAACAGGTAAACCAGATCTTTGTTGCTTTAATAATTCTTAAACTACA 

CCTACATTTTCAGCTTTATATTTATGTACTTTATGCAGGATTTCAAACAA
GACTCTGATTAATGTGAAGTGAATATTAAAGGTAAATTATATTTTCATGT 
TCCTAGTTGCCTATTAATTAAAGGCCTTTTAGTTCGTGATTTTTGGATGT 
ATTCTTCATGATGATGTCAATCTTCTAATACCCCATTCATTGTTTGGTTG 
AATGTTGACTCTATGTCAGGATGAATATTCAAGGGAAGAATTGTTCATCA 

TATGAAGGACATTAAAGAACATGGTGCTAT

RG2C deduced polypeptide sequence (SEQ ID NO:92) 

MAMETANEIIKQVVPVLMVPINDYLRYLVSCRKYISDMDLKMKELKEAKDNVEEH
KNHNISNRLEVPAAQVQSWLEDVEKINAKVETVPKDVGCCFNLKIRYRAGRDAFNI 

IEEIDSVMRRHSLITWTDHPIPLGRVDSVMASTSTLSTEHNDFQSREVRFSEALKALE
ANHMIALCGMGGVGKTHMMQRLKKVAKEKRKFGYIIEAVIGEISDPIAIQQVVADY 
LCIELKESDKKTRAEKLRQGFKAKSDGGNTKFLIILDDVWQSVDLEDIGLSPSPNQG
VDFKVLLTSRDEHVCSVMGVEANSIINVGLLIEAEAQRLFQQFVETSEPELHKIGEDI
VRRCCGLPIAIKTMACTLRNKRKDAWKDALSRLQHHDIGNVATAVFRTSYENLPD

KETKSVFLMCGLFPEDFNIPTEELMRYGWGLKLFDRVYTIIEARNRLNTCIDRLVQT
NLLIGSDNGVHVKMHDLVRAFVLGMYSEVEQASIVNHGNMPGWPDENDMIVHSC
KRISLTCKGMIEFPVDLKFPKLTILKLMHGDKSLKFPQEFYEGMEKLRVISYHKMKY
PLLPLAPQCSTNIRVLHLTECSLKMFDCSCIGNLSNLEVLSFANSCIEWLPSTVRNLK 
KLRLLDLRLCYGLRIEQGVLKSLVKLEEFYIGNAYGFIDDNCKEMAERSYNLSALEF 

AFFNNKAEVKNMSFENLERFKISVGCSFDGNINMSSHSYENMLRLVTNKGDVLDSK
LNGLFLKTEVLFLSVHGMNDLEDVEVKSTHPTQSSSFCNLKVLIISKCVELRYLFKL
NVANTLSRLEHLEVCKCKNMEELIHTGIGGCGEETITFPKLKFLSLSQLPKLSGLCH
NVNIIGLPHLVDLKLKGIPGFTVIYPQNKLRTSSLLKEEVVIPKLETLQIDDMENLEEI
WPCELSGGEKVKLREIKVSSCDKLVNLFPRNPMSLLHHLEELTVENCGSIESLFNID 
LDCVGAIGEEDNKSLLRSINVENLGKLREVWRIKGADNSHLINGFQAVESIKIEKCK
RFRNIFTPITANFYLVALLEIQIEGCGGNHESEEQIEILSEKETLQEATGSISNLVFPSC
LMHSFHNLRVLTLDNYEGVEVVFEIESESPTSRELVTTHNNQQQPIILPYLQELYLR
NMDNTSHVWKCSNWNKFFTLPKQQSESPFHNLTTIEMRWCHGFRYLFSPLMAELL 
SNLKKVKILGCDGIKEVVSNRDDEDEEMTTFTSTHKTTNLFPHLDSLTLNQLKNLK
CIGGGGAKDEGSNEISFNNTTATTAVLDQFELSEAGGVSWSLCQYAREIKIGNCHAL
SSVIPCYAAGQMQKLQVLRVMACNGMKEVFETQLGTSSNKNNEKSGCEEGIPRVN 

NNVIMLPNLKILSIGNCGGLEHIFTFSALESLRQLQELTIKGCYRMKVIVKKEEDEYG
EQQTTTTTTKGASSSSSSSKKVVVFPCLKSIVLVNLPELVGFFLGMNEFRLPSLDKLII
EKCPKMMVFTAGGSTAPQLKYIHTRLGKHTLDQESGLNFHQTSFQSLYGDTLGPAT
SEGTTWSFHNFIELDVKSNHDVKKIIPSSELLQLQKLVKINVMWCKRVEEVFETALE
AAGRNGNSGIGFDESSQTTTTTLVNLPNLGEMKLRGLDCLRYIWKSNQWTAFEFPN

LTRVEIYECNSLEHVFTSSMVGSLLQLQELEIGLCNHMEVVHVQDADVSVEEDKEK
ESDGKMNKEILVLPHLKSLKLLLLQSLKGFSLGKEDFSFPLLDTLEIYECPAITTFTK
GNSATPQLKEMETNFGFFYAAGEKDINSSIIKIKQQDFKQDSD. 



RG2D polynucleotide sequence (SEQ ID NO:93) and (SEQ ID NO:94) 

ACGACCACTATAGGGCGAATTGGGCCCGACGTCGCATGCTCCCGGCCGCC
ATGGCCGCGGGATGTAAAACGACGGCCAGTCGAATCGTAACCGTTCGTAC 
GAGAATCGCTGTCCTCTCCTTCAACCATTTAATGTATATGAGCTAAATTG
AAACATCTACTATCATGTTTAAATTTATAAACTTTTTCCTTTAGATTCAC 
TTGTCTGGATGTGTTTAATAAAACCCAATTTCCCACATGCGTAGAGATCA 

TAGATGTAACTATTGTTAATCAATTTTGCCTGCCAAGTTTTAATAATTAT
ACTTGGATATTAACAAAACTTTATCTAACGACCAAGGTAATATTAAAAAT 
AGGTTATTATTCTTCATGCTAATTAAAAGATGGGTTGCAAAAGTGAGACC 
ATGAAAACATTAACACGTTGATATTTTCAACTTTTATTCTTTCATATTCA
CCATATTTTTTACTTTCGTATTGATTAATCATCTTTCAATCACAGGCTCC 

TTGGCAAAAAGTCAGATCTATTAACAAATACTTCCATGTGGTTGCAAATT
ACAAGGATTTCAACATAATTACCAAAACATAGCATTATCATAAGATCGAA 
TAATAATCAAATTCTTCTATAATATTACACAAAGGTAACGTCATTAATTA 
ATTACGATACGAGACAGACTTTTTCACTCGTGACATCAACGGTCTATTCT 
AACTTTACTTAATTAAATGAATCTAGGATGTGCTCATATGCATGTAATAT 

TTGCTACCGTCATCTTTCAAATGACCATATTTTTATGTATTTATAATGAA
TCAATGAAAAACCGGATTTCTATTTAAAATTCTTAAAACTTCATCTTTTA
AGCCAGGGTGAATACAATTGTAGATCCACTGTTAATTTCCATCGATTATG 
CGTGATCAATTGTTGGCTGCATACGATGCAGGTGCTACCACAAGAATATG 
GCCATGGAAACTGCTAATGAAATTATAAAACAAGTTGTTCCAGTTCTCAT 

GGTTCCTATTAACGATTACCTACGCTACGTCGTTTCCTGCAGAAAGTACA
TCAGTGACATGGATTTGAAAATGAAGGAATTAAAAGAAGCAAAAGACAAT 
GTTGAAGAGCACAAGAATCATAACATTAGTAATCGTCTTGAGGTTCCAGC 
AGCTCAAGTCCAGAGCTGGTTGGAAGATGTAGAAAAGATCAATGCAAAAG
TGGAAACTGTTCCTAAAGATGTCGGCTGTTGCTTCAATCTAAAGATTAGG 
TACAGGGCCGGAAGGGATGCCTTCAATATAATTGAGGAGATCGACAGTGT 
CATGAGACGACACTCTCTGATCACTTGGACCGATCATCCCATTCCTTTGG 

GAAGAGTTGATTCCGTGATGGCATCCACCTCTACGCTTTCAACTGAACAC
AATGACTTCCAGTCAAGAGAGGTAAGGTTTAGTGAAGCACTCAAAGCACT 
TGAGGCCAACCACATGATAGCATTATGTGGAATGGGGAGAGTGGGGAAGA 
CCCACATGATGCAAAGGCTGAAGAAGGTTGCCAAAGAAAAGAGGAAGTTT 
GGTTATATCATCGAGGCAGTTATAGGGGAAATATCGGACCCCATTGCTAT 

TCAGCAAGTTGTAGCAGATTACCTATGCATAGAGCTGAAAGAAAGCGATA
AGAAAACAAGAGCTGAGAAGCTTCGTCAAGGGTTCAAGGCCAAATCAGAT 
GGAGGTAACACTAAGTTCCTCATAATATTGGATGATGTCTGGCAGTCCGT 
TGATCTAGAAGATATTGGTTTAAGCCCTTCTCCCAATCAAGGTGTCGACT 
TCAAGGTCTTGTTGACTTCACGAGACGAACATGTTTGCTCAGTGATGGGG 

GTTGAAGCTAATTCAATTATTAACGTGGGACTTCTAATTGAAGCAGAAGC
ACAAAGATTGTTCCAGCAATTTGTAGAAACTTCTGAGCCCGAGCTCCACA
AGATAGGAGAAGATATTGTTAGGAGGTGTTGCGGTCTACCCATTGCCATC 
AAAACCATGGCGTGTACTCTAAGAAATAAAAGAAAGGATGCATGGAAGGA 
TGCACTTTCTCGTTTACAACACCATGACATTGGTAATGTTGCTACTGCAG 

TTTTTAGAACCAGCTATGAGAATCTCCCGGACAAGGAGACAAAATCTGTT
TTTTTGATGTGTGGTTTGTTTCCCGAAGACTTCAATATTCCTACCGAGGA 
GTTGATGAGGTATGGATGGGGCTTAAAGTTATTTGATAGAGTTTATACAA 
TTATAGAAGCAAGAAACAGGCTCAACACCTGCATTGAGCGACTGGTGCAG 
GCAAATTTACTAATTGGAAGTGATAATGGTGTACACGTCAAGATGCATGA

TCTGGTCCGTGCTTTTGTTTTGGGTATGTATTCTGAAGTCGAGCAAGCTT
CAATTGTCAACCATGGTAATATGCCTGGGTGGCCTGATGAAAATGATATG 
ATCGTGCACTCTTGCAAAAGAATTTCATTAACATGCAAGGGTATGATTGA 
GATTCCAGTAGACCTCAAGTTTCCTAAACTAACGATTTTGAAACTTATGC 
ATGGAGATAAGTCTCTAAAGTTTCCTCAAGAATTTTATGAAGGAATGGAA 

AAGCTCCAGGTTATATCATACGATAAAATGAAGTACCCATTGCTTCCTTT
GGCACCTCAATGCTCCACCAACATTCGGGTGCTTCATCTCACTGAATGTT 
CATTAAAGATGTTTGATTGCTCTTCTATCGGAAATCTATCGAATCTGGAA 
GTGCTGAGCTTTGCTAATTCTCGCATTGAATGGTTACCTTCCACAGTCAG 
AAATTTAAAGAAGCTAAGGTTACTTGATCTGAGATTTTGTGATGGTCTCC 

GTATAGAACAGGGTGTCTTGAAAAGTTTGGTCAAACTTGAAGAATTTTAT
ATTGGAAATGCATATGGGTTTATAGATGATAACTGCAAGGACATGGCAGA 
GCGTTCTTACAACCTTTCTGCATTAGAATTCGCGTTCTTTAATAACAAGG 
CTGAAGTGAAAAATATGTCATTTGAGAATCTTGAACGATTCAAGATCTCA 
GTGGGGTGCTCTTTTGATGGAAATATCAGTATGAGTAGCCACTCATACGA 

AAACATGTTGCAATTGGTGACCAACAAAGGTGATGTATTAGACTCTAAAC
TTAATGGGTTATTTTTGAAAACAGAGGTGCTTTTTTTAAGTGTGCATGGC
ATGAATGATCTTGAAGATGTTGAGGTGAAGTCGACACATCCTACTCAGTC
CTCTTCATTCTGCAATTTAAAAGTCCGTATTATTTCAAAGTGTGTAGAGT 
TGAGATACCTTTTCAAACTCCATGTTGCAAACACTTTGTCAAGCCTTGAG
CATCTAGAAGTTTGTGGATGCGAAAATATGGAAGAACTCATACATACTGG 
GATTGGGGGTTGTGGAGAAGAGACAATTACTTTCCCCAAGCTGAAGTCTT 
TATCTTTGAGTCAACTACCGAAGTTATCAGGTTTGTGCCATAATGTCAAC 

ATAATTGGGCTACCACATCTCGTAGACTTGAAACTTAAGGGCATTCCAGG
TTTCACAGTCATTTATCCGCAGAACAAGTTGCGAACATCTAGTTTGTTGA 
AGGAAGAGGTAGATATATGTTCTTTATGTTAATACAATTTAAATAATATT 
TTCAACCAAAATTTCATAATATATCTGTAATTTGATTGTATGATGTGTTA
TTGTTTATATGTGGCTATTAAGGGATGATTATTTTGCAGGTTGTGATTCC 

TAAGTTGGAGACACTTCAAATTGATGGCATGGAGAACTTAGAAGAAATAT
GGCCTTGTGAGCTTAGTGGAGGTGAGAAAGTTAAGTTGAGAGAGATTAAA 
GTGAGTAGCTGTGATAAGCTTGTGAATCTATTTCCGCACAATCCCATGTC 
TCTGTTGCATCATCTTGAAGAGCTTAAAGTCAAAAATTGTCGTTCCATTG 
AGTCGTTATTCAACATCGACTTGGATTGTGTCAGTGCAATTGGAGAAGAA 

GACAACAAGAGCATCTTAAGAAGAATCAAAGTGAAGAATTTAGGGAAGCT
AAGAGAGGTGTGGAGGATAAAAGGTGCAGATAACTCTCGTCCCCTCATCC 
ATGGCTTTCCAGCTGTTGAAAGCATAAGTATCTGGGGATGTAAGCGGTTT 
AGAAATATATTCACACCTATCACCGCCAATTTTGATCTGGTGGCACTTTT
GGAGATTCACATAGGAAATTACAGAGAAAATCATGAATCGGAAGAGCAGG 

TAACGCTTTCAATTTCACTTTCTTACTTAATTAAGGACTAAGCTCTTGTT
TTTTGAATAATAAAGAGGTGGGATGACTAAACTTGGGCATCACAATTGTA 
ACAAAATGTTACAAACCATGAACGTACAAACCATTTCTTGAATTAAGGTT 
TCAATACAAGTCATTTACAAATATGGCTTAAGTTTTTTTATATTTATGTA
TCAACATTATTTTTCATTAGAGGTCATTATTATAATAGTAAGTTTAAAGC

AATTTAAATTAGCACTAATTTTTCATCATCTAACTTTAGCTAATAAATCG
TTATAAATGTCAATAGCTAAAATAAAAATATTTGACATTCACTGAGAGCA 
ATTTTTTCTAAACATGATTGCAAATGATTAAAACTTAAATTTAAACTAAA 
AAGATTTTTATATATGTTATACAAAATTTACAAATTGAAATTGGATATGT 
TAATTAACAGTTTATAATTATTGTATTACAAAGCGATATATAATAAAATA 

TTATTTTTCTGTAGTCATGTATAATTGTATATGTAAATGATTTTTTAAGA
TGGTAGAAGTGGAAACTAGTCAATCTCACTTAACTCATTGTCACACCAGT 
TTTATATCCGTTTCTCTCTCTCTCTCTTCTTGCCTCCATCTTTTTTCAAC 
TCATAACACATAAAAATAACATATTTTCCAACACATTTAAGTCACTACCA 
CATCATTATTTTTAATTTAATTAAATTAGAAAATATAAAATTAAATAAAA 

CATAACATTTTTTTATTAAAAGGCACTAATACAAATAAAAAGATACACGG
TAAATAAAAAAACGATAATTAGAAAAAAAACATAATAAAAAAAGACAACA
TTAAAAATAWAAAGCGACAACTAAAATTAACTAATGATCAAGAAAATTCT
AAAACTCCCACCATATTTTTCTGCAATTTGTCATTTATGTTCAAACACCA 
TTCGCAGAATCCCTCCTATCAAGTGATCATGTTGATTGAGAAAAAACTGT 

ATGTCTCTCTCATGTATCTCCAAGTCCAACAAGTTAGCTTTCATTTCTTC
ATTTTCTCATGTAAGACGCAAATTTTCATCCCGATATTGTTTTCTATCTT 
CCACCTCTACTTTATTCACAGTGTGGATGAAGGAGAGGACAGCGATTCTC 
GTACGAACGGTTACGATTCGACTGGCCGTCGTTTTACAATCCCGCGGCCA 
TGGCGGCCGGGAGCATGCGACGTCGGGCCCATTCGCCCTATAGTGGTCGT
AATACA (SEQ ID NO:93) 

Sequence gap 

TGAGCCTCCGATGCTTAGTCCACTTGGCACAGTTCAAGTCCAATCAACTT 

ATAACCCATTTTTCTTCAAGTTGTCTTCAAGTTAAGCCCAATTTGCCTTC
TCCAAATCATCCATAACTTCATGGAATCGCCCCTTCATCTTAATCCCGAA 
TGCACAATTATTCTCCCATCTTCATTTTAAGCAAGAGGCCACCTTCTTCA 
TGCTTCATCCATCAATAGTCTGTTGGAATAGTGTCTAAGGCTGCAACTAT 
ATTAGACAAGTATTTGACCCGGTTGTGCATGGTCCTTTTGGGTTGCCTTC 

ACCATAGCAACTTGATAGGATGATTTATTAAGAGAGAGTAAATATTATTA
ATATATTATGAGAATAATATAATGAATAATATATTTGTTATTTGATTAAT 
ATAAGTCATAGAATTAATTAGAATTAATTTGGTGACTTAAAGAGATTAAT 
TAAATAAAGGGGTATAAACTGTCAATTGTTTGATAGTTAAGCTTTAGACT 
GTAAATCCATTTGGATATGGTATGGACGAATCCTAAGGGATTTAGGATAG 

CTAAAATCGTCCATATGAGTTATCTAAGAAGGATTTGGATAGCCTTAAGA
GAAGATTATCTGATAGGGACTTATCTGTAATCCTTAAGGAGTCTACAAGT 
ATAAATAGACCCTATGGCTGATGGAATTCGACACATCTCCTAAAGTAAGA
GAGCCTTGGCCGAATTCCTCCCCTCACCTCTCTCCTAAATCATTCTTCTT 
GCTATTGGTGTTTGTAAGCCATTAGAGGAGTGACATTTGTGACTCTAGAA 

TCTCCAAGACCTCAAGATCAACAAGGAATTCAAAGGTATGATTCTAGATC
TGTTTCAATGTTGTTATTTGTCCTAATTAGTCATTAGAAGACTTGGATTC 
AAAGCATGTTTATTAGAAAGCCTAGATCYGAGCAATAGGGTTTTGCATGC 
GCACATAGGAAAGTTCTTATGGCTAAAACCCATCATAGTCCACTTCATGT 
ATCATCTCTACTAGTTATTTAGTCCATAATCCTTGTTGTCCTCCAAGTTT 

AATTACCTCCCTTAGTTCCTGTTCTGCTAGTTTCCTTAAAATTTGCTATT

AAGATCACAGAACTAGAGAGTACCCAAAATGGTTATAAAATAACAAAAAG 
GAAAATATGCATGAAGATTAACTAAATTATAAATGTAATATGCTAAAATA
AACTATAAAAAAAAAGTAAATAAAATGAAACTATCACACTCCGACCACCC 
TTATAGGCTTGTACTGCACCCACCCTTCATTCCTTGTACCAATATGGGAT 

GGAAACATTATTCATTAAGCCAAAAAACTAACATTTAAGGGGTGAGTGAC
AAAGGTAAGTACTAAAGACAACAATAATCCATTTTTCTTGTACATACACA 
ACACACACATAGGGGCGGACGTAGGATTTGTAGTATGTGTTGTGGGTGAC 
ACATTTTTTCTTTTACGTAGTGACACAATAGTAGAGAAAACGAGAAATTC 
CAATTTTTTACATTGTGTTCGAAAAAATATACAGGGGTTGCTGGTGCTAC 

TCTGGGCACCAAAGTGGAACCGCCCCTGCACACACACACACATAGAGGGA
GAGAGAGAGGAGAAAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGATTT 
TGGGATGTGATACTTCTTTTGGGAAAATGGAGCAATATCTTTAATATTGT 
ATTTTTTTAATGTAATTTATATATTTAATCATTTTAGTTTATAACTTTTA 
GTTTTTTTTATTTTAATCTGTATATTTAATCATTTCAGTTTATAAGTTTT

ATTTATTTTGGTATACCAGAAAAAAAAGTCTTTTATGTGTTGGATTTAAC
ATAAAAATCTAACAATATTAATCAAAAAGACCAAACATGTGGACAATTAT
GTATATAATTAATTCTCAATGGTCTTAGTGTAACGATATAAATTTCAAAA 
CAATTTTTCACATTAAAAAAAACACTTTCAGTCATAATTGTTATAAATTA 
TCATTGTATCACAAAATCAGTTCATAACATCACATCCCAAGATCAATAAA
GTGTAAATACTCCTCATGTGTGTACTAATCAAGCCGACGCCTTCCCGCGA 
TTCTCACTGGTACCTGAAACACGTAACATAACAACTGTAAGCATAAATGC 
TTAGTGAGTTCCCCAAAATACCACATACCACATATATGCCTTTCCAGGCC 

ATAACTCTGTAGGATCTTCCGACCCAAGTGTCTCAGGGGACTTCCGTCCC
GAATCCCGGTAGACCTTCCGGTCCTACCCGTATTGACCTTCCGGTCCGTA 
TCATACATAACATACATAACACATACATATCACATAACAACATATAGCAC 
ATACATCTCATAACATAAAAGACCTTCCGGTCACATAAAGGTACCCTTCC 
AGGTACAGTATAGTGAGAANACTCACCTCGTATGATGTCTAATACCTCAC 

GTGCTCGATATCCCTGAATCTCGAAACAATGACCTAGCCCCGCCTACTCA
CATAAAGTAATTATTTCAAATCATTAACGGCTCTCAAGGCTAGACTACAT
CCCTTTCTATAAATCCACAGAAGGGTAAAAGACCATTTTACCCCTCCTTG 
ACCCAAAAGTCCAAATGTTGATCAAAACCCCAAAAGTCAACGAAAGACAA 
TGGTCAACTTTGACCCTACTCGTGGAGTGCACAAAGGTGACTCGGCAAGT 

ACATGCGGGTCCTCTGAATCCTTTCAGTCTCTCTTGGCTCGTCGAGTCTT
TCTTCCACCCGACGAGTTACACCTGTCATGAATCGCGGGGCAACCCCGAC 
TCGACTTGTCGAGTCCGCTCATGGACTCAACGAGTTCATTCCATGCTCAC 
ACTCAAATGACCTCCTGAGGTCAGATCTGTTCCTCTAATCCATAGATCTG 
ACCTTCCCAAGCTCAATAAACACGTAAAGGTTCGAACTTGATACTCATGC 

AACGTCCAAATGATTCTACTTGATGATTTAGCCCCAAATACAACATCCTA
AGTCCATACGACCTTATTTTTCTCAAATAACAACACATATATTTAATTAC 
CAATGACAGTAATAGATATCATATAAAGTATTTGTAACACTTTGTAAGAA 
CCTTGCTACTATAGGTAAAAAGAAACATTTCAAAGTACATGCCCTAATTA 
GAAAAAAAGTTATAAAAAAATAATGACTAGGGGCGTGTTTTTTTTACTAG 

TTTGTATCAAATTATATCAAAATTTAAGGTGGAAAAGAATGACGACCACA
TTAACCAGAAATGTAATTATTTTTTTATTTGGTAATTTTTAATATTTGTT 
GTGATCTATGTATTTAAAAGTAAATATCAAACAAGAACATAATCCAAACC 
CTAAATTGCAAGTCTCGCCCAATTTCTCTATCACTAGTCCTCACTTACGA
TGGCGTTACGTCGCTCTCTCACTTCCTACAACCCATTGTTGCTACTAATT 

ACACTAACGAAAAGTTGAATATCCATATATTTATTTGGATGTGAAATTGA
ACGAATCTCGTCAAATTTTTTATTTTGTTGATGGATTTGAGTGGAAGTTT
AGGCAGAACGGGAATGATGGTCTGCAAGTGGTTATAAACATGGGTGAAGA 
TAAAATGGAGTTGTCGCCGTTGTATTATAGATCTCTTAGGGGTTTGATTC 
TGAGTTATTACTGTATACGTAGCCTCTTTACAACGACCATTCTTCCAAGT 

ACCATTTGATCTTTTTAGAATCCAGTTGTCTGAAACACCCTGATTTGGAT
CAAATATCACCAACAACTCTTAAGAACTGGACTAATTAATTGTTTTCTTG 
ATCTTGATAACAAGAGGAAACACGTCACCATATCTTTTATTTTAAATTTG 
CTTTTGGTGTATTTTCTTTCTTCCCATTTCTTTCTTGATCTGTTCCAGAT
GGTATTTGGTGTGGATAATTTACACCTGGAGATTGTGAACGATGGGAAGG 

GGTATGTGATTTACAGAGGATGTGGCTTGTGGTTGAGGATGGTTTATGGC
TGGCCGAGTCTAATTTATATTTATATAAACAAATAAATATATAAAACAAG 
GGTAAAATATGTATTTAAGCGTCCTCTTTTAATGGTGACAATTTTTACAG 
TTTACTCTCTTTGTTTTTTAATTGTGATGCCCACGATCGAACTCATTCAT 
CCCCCCCCCTTTTTTTTTTAAAATAAAAAATTAAGAAGGGGTACCACCAT
ATACCCGTGTCAGCTTCTTATTCCCAAGCAGTCAAATAGGGACTTAGGTT 
GTATGGAAACAGTTCCGTGACTTGGATGGCAGATAAATTTAGTAAACTTA 
ACCCTTCAATTAACCTACCTTTTTCTTATTAACTCAATTTCAAGCTAAAT

TCTGATTCTTGTTTGAAAATAAGTTGCATCTTTATTTTTGCATATTATCT
TGTTGCATAGGATCCTTAGCATCTTTTAATAGTTTATTTGAAGCTGAAAG 
ATCCAACTAGTTTTGATCTGTTGGCATTTTCCATCATTTGCAACTGTTTC 
TTGAAAAAAAATACCTAAAATCAAAATAACCATTTTCAAATCCAAAATTA 
TAAGAGAGAATTGTTAATGGACGTGGAATCATAAATCATTAACACAGTTC 

AGTACACAAGTTGCTAATTACATTTCTTGCTGTGCAGATTGAAATTCTAT
CAGAGAAAGAGACATTACAAGAAGTCACTGATACTAATATTTCTAATGAT 
GTTGTATTATTCCCATCCTGTCTCATGCACTCTTTTCATAACGTCCATAA 
ACTTAAATTGGAAAATTATGAAGGAGTGGAGGTGGTGTTTGAGATAGAGA 
GTGAGAGTCCAACATGTAGAGAATTGGTAACAACTCACAATAACCAACAA 

CAGCCTATTATACTTCCCAACCTCCAGGAATTGTATCTAAGGAATATGGA
CAACACGAGTCATGTGTGGAAGTGCAGCAACTGGAATAAATTCTTCACTC 
TTCCAAAACAACAATCAGAATCACCATTCCACAACCTCACAACCATAGAA 
ATGAGATGGTGTCATGGCTTTAGGTACTTGTTTTCGCCTCTCATGGCAGA 
ACTTCTTTCCAACCTAAAGAAAGTCAAGATACTTGGGTGTGATGGTATTG 

AAGAAGTTGTTTCAAACAGAGATGATGAGGATGAAGAAATGACTACATTT
ACATCTACCCACACAACCACCAACTTGTTCCCTCATCTTGATTCTCTCAC 
TCTAAAATACATGCACTGTCTGAAGTGTATTGGTGGAGGTGGTGCCAAGG 
ATGAGGGGAGCAATGAAATATCTTTCAATAATACCACTACAACTACCGAT 
CAATTTAAGGTATGTTTGTACATATTTAATTATATATTTAATTTCCTTGT 

TAATTTCCTTTTCTTTGCAATATTCTATGCGAACTCAAGAATGGGATTTG
GAGGCATATAAAGTTACATTCATTTGAACAAGTATTACCTTTTATTTGTT 
ATTTATCATTTTCATATCAAGTACCTATAACATTTCTTTTTTATTTTTCT 
AATTAGAAGAGGTCCACATGTCTAATTAGGTTTTCCATTCTATGTGTAAC 
CTCTATTCTCTCTGTAATCAAGCATCTTAGATTATTTATCCATTTTCATA 

ATTGTGTTTATTTTTACAGTTTTTTTTTTTAm

TTTTAATTTATTTATTATTTTTTTTTTGGTAATTGCAACCTGTCATATAT 
TCAAGTCTTAATGTAACATAATAATACATTTTATACCCACTATACTAAGA 
TAATAATTACCTAAAGGGATGGATGCCATGACACTGCTACACTTCAGNAA 
CTCTAGTAAGGGCAGTTATGGAAGTTCAATAAAATGATAATGGCATCTTT 

TGATGGGTAATATAGGCAATTTAAGTTTTATTTCTGTTAAAGCAGTATTT
AGCTAGTAGTGGCCAGTAGGAGAGGAGAATATCACCTTTTGTCAAAATCT 
GGTCATTGTACCCAGAATTTAGTTAAATGTAACATTTTAGATATTAGGGG 
TCATCAGGTGACAGATATTGTAGAATAGAACAATATGTAATATTACCCAA 
AACTATTTTTTCTAAGGTTGCTCTGTTAAATATGTGCTTTCTTGATTTCA 

TTGAATTTGCATTCGTATATTTTAGGTGGTAAACTGATTGTCTCTTCAAT
AAATCCTGAAATTAATTAAAAAAAAAAAAACAAAAGTACATTTTTGATTT 
GGAGAGCACTGGTATCATTTAGTATAGAAAAAAACTAGATTTTGAATTAY 
CTTTCTTATATAAAAGTTGTGTATATAGTTTAATTAGTTTTACATCATTT 
TTCTATGTGTTGTTGCAGTTGTCTGAAGCAGGTGGTGTTTGTTGGAGCTT
ATGCCAATACTCTAGAGAGATAGAGATATATAGGTGTGATGCACTGTCAA 
GTGTAATTCCATGTTACGCAGCAGGACAAATGCAAAAGCTGCAAGTGCTG 
ACAGTCAGTTCTTGTAATGGTCTGAAGGAGGTATTTGAAACTCAATTAGG 

GACGAGCAGCAACAAAAACAACGAGAAGAGTGGTTGTGAGGAAGGAATTC
CAAGAGTAAATAACAATGTTATTATGCTTCCCAATCTAAAGATATTGGAA 
ATCTACGGTTGTGGGGGTTTGGAACATATATTCACATTCTCTGCACTTGA 
AAGCCTGAGACAGCTCCAAGAGTTAACGATTAAGGGTTACTACTCTTGTC 
AATCTTCCAAACCTCAAAGAAATGAGGTTGGAGTGGCTAAGTAATCTGAG 

GTATATATGGAAGAGCAATCAGTGGACAGCATTTGAGTTTCCAAACCTAA
CAAGAGTTGAAATTTGTGAATGTAATTCATTAGAACATGTATTTACTAGT 
TCCATGGTTGGTAGTCTATTGCAACTCCAAGAGCTACATATATTTAACTG 
CAGTCTGATGGAGGAGGTAATTGTTAAGGATGCAGATGTTTCTGTAGAAG 
AAGACAAAGAGAAAGAATCTGATGGCAAGACGAATAAGGAGATACTTGTG 

TTACCTCATCTAAAGTCCTTGAAATTACAACTTCTTCGAAGTCTTAAGGG
GTTTAGCTTGGGGAAGGAGGATTTTTCATTCCCATTATTGGATACTTTAG 
AAATCAAAAGATGCCCAACAATAACCACCTTCACCAAAGGAAATTCCGCT 
ACTCCACAACTAAAAGAAATACAAACAAATTTTGGCTTCTTTTATGCTGC 
AGGGGAAAAAGACATCAACTCTCTTATAAAGATCAAACAACAGGTAAATC 

AGATCTTTGTTGCTTTAATAATTCTTAAACTACATTTGAAAAGCTTCATG
CAAGTTTTTTTGTTATATTGTCAAAAACCGCAACCTACATTCAGCTTTAT 
ATTTATGTACTTTATGCAGGATTTCAAACAAGACTCAGATTAATGTGAAG 
TGAATATTAAAGGTAAATTATATTTTCATGTTCCTAGTTGCCTATTAATT
AATGGCCTTTTAGTTCATGATTTTTGGATGTATTCTTCATGATGATGTGA 

ATCTTCTAATACCCCATTCATTGTTTGGTTGAATGTTGACTCTATGTCAG

GATGAATATTCAAGGGAAGAATTGTTCATCAWATGAAGGACATTAAAGAA 
CATGGATGCTATGAAGATGTTGGGAAAACATATGTATCAAGTGGCAARCT 
GCTTAATGATCTAAGTTTGTTGGTTGANGATGTTGATTTTAATATTTCAA 
ATTCATTGGTTATATGGGCTTATCAATAGTGTTAATGGGATAATGAGTGA 

CTTAACCTAAATTATGTTGTTGGTAAATGTTGGACAAGTATGGAAAATTA
GGAATGACTTGTGAAAAAAAAATAAAAAAAAA (SEQ ID NO:94) 

RG2D deduced polypeptide sequence (SEQ ID NO:95) 

MAMETANEIIKQVVPVLMVPINDYLRYVVSCRKYISDMDLKMKELKEAKDNVEE
HKNHNISNRLEVPAAQVQSWLEDVEKINAKVETVPKDVGCCFNLKIRYRAGRDAF
NIIEEIDSVMRRHSLITWTDHPIPLGRVDSVMASTSTLSTEHNDFQSREVRFSEALKA
LEANHMIALCGMGRVGKTHMMQRLKKVAKEKRKFGYIIEAVIGEISDPIAIQQVVA 
DYLCIELKESDKKTRAEKLRQGFKAKSDGGNTKFLIILDDVWQSVDLEDIGLSPSPN
QGVDFKVLLTSRDEHVCSVMGVEANSIINVGLLIEAEAQRLFQQFVETSEPELHKIG
EDIVRRCCGLPIAIKTMACTLRNKRKDAWKDALSRLQHHDIGNVATAVFRTSYENL
PDKETKSVFLMCGLFPEDFNIPTEELMRYGWGLKLFDRVYTIIEARNRLNTCIERLV
QANLLIGSDNGVHVKMHDLVRAFVLGMYSEVEQASIVNHGNMPGWPDENDMIVH 
SCKRISLTCKGMIEIPVDLKFPKLTILKLMHGDKSLKFPQEFYEGMEKLQVISYDKM

KYPLLPLAPQCSTNIRVLHLTECSLKMFDCSSIGNLSNLEVLSFANSRIEWLPSTVRN

LKKLRLLDLRFCDGLRIEQGVLKSLVKLEEFYIGNAYGFIDDNCKDMAERSYNLSA

LEFAFFNNKAEVKNMSFENLERFKISVGCSFDGNISMSSHSYENMLQLVTNKGDVL

DSKLNGLFLKTEVLFLSVHGMNDLEDVEVKSTHPTQSSSFCNLKVRIISKCVELRYL

FKLHVANTLSSLEHLEVCGCENMEELIHTGIGGCGEETITFPKLKSLSLSQLPKLSGL 

CHNVNIIGLPHLVDLKLKGIPGFTVIYPQNKLRTSSLLKEEVVIPKLETLQIDGMENL

EEIWPCELSGGEKVKLREIKVSSCDKLVNLFPHNPMSLLHHLEELKVKNCRSIESLF

NIDLDCVSAIGEEDNKSILRRIKVKNLGKLREVWRIKGADNSRPLIHGFPAVESISIW

GCKRFRNIFTPITANFDLVALLEIHIGNYRENHESEEQIEILSEKETLQEVTDTNISND 

VVLFPSCLMHSFHNLHKLKLENYEGVEVVFEIESESPTCRELVTTHNNQQQPIILPN

LQELYLRNMDNTSHVWKCSNWNKFFTLPKQQSESPFHNLTTIEMRWCHGFRYLFS 

PLMAELLSNLKKVKILGCDGIEEVVSNRDDEDEEMTTFTSTHTTTNLFPHLDSLTLK

YMHCLKCIGGGGAKDEGSNEISFNNTTTTTDQFKLSEAGGVCWSLCQYSREIEIYRC

DALSSVIPCYAAGQMQKLQVLTVSSCNGLKEVFETQLGTSSNKNNEKSGCEEGIPR 

VNNNVIMLPNLKILEIYGCGGLEHIFTFSALESLRQLQELTIKGYYTLVNLPNLKEM 

RLEWLSNLRYIWKSNQWTAFEFPNLTRVEICECNSLEHVFTSSMVGSLLQLQELHIF 

NCSLMEEVIVKDADVSVEEDKEKESDGKTNKEILVLPHLKSLKLQLLRSLKGFSLGK

EDFSFPLLDTLEIKRCPTITTFTKGNSATPQLKEIQTNFGFFYAAGEKDINSLIKIKQQ

DFKQDSD.CEVNK 

RG2E polynucleotide sequence (SEQ ID NO:96) 

TGGGAAGACACAATGATGCAAAGGTTGAAGAAGGTTGCTAAAGAAAATAGAAT 

GTTCAATTATATGGTTGAGGCAGTTATAGGGGAAAAGACAGACCCACTTGCTAT 

TCAACAAGCTGTAGCGGATTACCTTTGTATAGAGTTAAAAGAAAGCACTAAACC

AGCAAGAGCTGATAAGCTTCGTGAATGGTTTAAGGCCAACTCTGGAGAAGGTA

AGAATAAGTTCCTTGTAATATTTGATGATGTTTGGCAGTCCGTTGATCTGGAAG

ACATTGGTTTAAGTCATTTTCCAAATCAAGGTGTCGACTTCAAGGTCTTGTTGA 

CTTCACGAGACGAACATGTTTGCACAGTAATGGGGGTTGAAGCTAATTCAATTC 

TTAATGTGGGACTTCTAGTAGAAGCAGAAGCACAAAGTTTGTTCCAGCAATTTG

TAGAAACTTTTGAGCCCGAGCTCCATAAGATAGGAGAAGATATCGTAAGGAAG

TGTTGTGGTTTACCTATTGCCATTAAAACCATGGCATGTACTCTAAGAAATAAA 

AGAAAGGATGCATGGAAGGATGCACTTTTGCATTTAGAGTACCATGACATTAGC 

AGTGTTGCGCCCAAAGTCTTTGAAACGAGCTACCATAATCTCCACAACAAGGAG 

TCGAGGAGTTGATGAGGTATGGATGGGGCTTAAAGATATTTGATAGAGTTTATA 

CTATTAGACAAGCAAGAATCAGGCTCAACACCTGCATTGAGCGACTGGTGCAG 

ACAAATTTGTTAATAGAAAGTGATGATGGTGTGCACGTCAAGATGCATGATCTG 

GTCCGTGCTTTCGTTTTGGTTATGTTTTCTGAAGTTGAACATGCTTCAATTATCA 

ACCATGGTAATATGCTTGGATGGCCTGAAAATTATATGACCAACTCTTGCAAAA 

CAATTTCATTAACATGCAAGAGTATGTCTGAATTTCCGGGAGATCTCAAGTTTC 

CAAACCTAACGATTTTGAAACTCATGCATGGAGATAAGTTGCTAAGATATCCTC 
AAGACTTTTATGAAGGAATGGAAAAGCTCTGGGTTATATCATATGATGAAATGA
AGTATCCATTGCTTCCCTCGTTACCTCAATGCTCCATCAACCTTCGAGTGCTTCA 
CCTCCATCGATGCTCATTAATGATGTrTGATTGCTCTTGTATTGGAAATATGTTG 
AATCTGGAAGTGCTTAGCTTTGTTAAATCTGGCATTGAATGGTTACCTTCCACA 
ATAGGAAATTTAAAGAAGCTAAGGTTACTTGATCTGAGAGATTGTTATGGTCTT
CGTATAGAAAAAGGTGTCTTGAAAAATTTGGTGAAAATTGGAGGAATTTATATT 
GGTAGAGCAGATATTTTATAGAT 

RG2E deduced polypeptide sequence (SEQ ID NO:97) 

WEDTMMQRUCKVAKENRMFNYMVEAVIGEKTDPLAIQQAVADYLCIELKESTKP
ARADKLREWFKANSGEGKNKFLVIFDDVWQSVDLEDIGLSHFPNQGVDFKVLLTS 
RDEHVCTVMGVEANSILNVGLLVEAEAQSLFQQFVETFEPELHKIGEDIVRKCCGL 
PIAIKTMACTLRNKRKDAWKDALLHLEYHDISSVAPKVFETSYHNLHNKETKSVFL 
MCGFFPEDFNIPIEELMRYGWGLKIFDRVYTIRQARIRLNTCIERLVQTNLLIESDDG 

VHXTCMHDLVRAFVLVMFSEVEHASIINHGNMLGWPENYMTNSCKTISLTCKSMSE
FPGDIXFPNLTIUaJklHGDKLIJRYPQDFYEGNIEKLWVISYDEMKYFLLPSLPQCSI 
NLRVLHLHRCSLMMFDCSCIGNMLNLEVl^FVKSGffiWIJ>STIGNIJmjlLLDLRD 
CYGLRIEKGVLKNLVKIGGIYIGRADIL. 

RG2F polynucleotide sequence (SEQ ID NO:98)

CTGTGGAAGACACAATGATGCAAAGGCTGAAAAAGGTTGTGCATGAAAAGAAA 
ATGTTTAACTTTATTGTTGAAGCAGTTATAGGGGAAAAGACAGACCCCGTTGCC 
ATTCAGGATGCTATAGCAGATTACCTAGGTGTAGAGCTCAATGAAAAATCTAAG 
CAAGCAAGAGCTGATAAGCTCCGTCAAGGATTCAAGGACAAATCAGATGGAGG 

CAA.^ATAAGTTCTTTGTAATACTTGACGATGTTTGGCAGTCTGTTGATCTGGA
AGATATTGGTTTAAGTCCTTTTCCAAATCAAGGCGTCGACTTCAAGGTCTTGTT 
GACATCACGAGACAGACATGTTTGCACAGTGATGGGGGTTGAAGCCAAATTAA 
TTCTAAACGTGGGACTTCTAATTGAAGCTGAAGCACAAAGTTTGTTCCACCAAT 
TTGTTGTCACTTCTGAGCCCGAGCTCCATAAGATAGGAGAAGATATTGTAAAGA 

AGTGTTTCGGTCTGCCAATTGCCATCAAAACCATGGCATGTACTCTACGACATA
AAAGAAAGGATGCATGGAAGGATGCACTTTCACGTTTAGAGCACCATGACATT 
CAAAGTGTTGTGCCTAAAGTATTTGAAACGAGCTACAACAATCTCAAAGACAA 
GGAGACTAAATCCGTATTTTTGATGTGTGGTTTGTTTCCTGAAGACTTGGATAT 
ACCTATCGAGGAGTTGATGAGGTATGGATGGGGCTTAAGATTATTTGATAGAGT 

TAATACTATTACACAAGCAAGAAACAGGCTCAACACCTGCATTGAGCGACTGG
TGCACACAAATTTGTTAATTGAAAGTGTTGATGGTGTGCATGTCAAGATGCATG 
ATCTGGTTCGTGCTTTTGTTTTGGGAATGTTTTCTGAAGTGGAGCATGCTTCAAT 
TGTCAACCATGGTAATATGCCCGAGTGGACTGAAAATGATATGACTGACTCTTG 
CAAACAAATTTCATTAACATGCAAGAGTATGTTGGAGTTTCCTGGAGACCTCAA 

GTTTCCAAACCTAAAGATTTTGAAACTTATGCATGGAGGTAAGTCACTAAGGTA
TCCTCAAGACTTTTATCAAGGAATGGAAAAGCTGGAGGTTATATCATACGATGA 
AATGAAGTATCCATTGCTTCCCTCGTTGCCTCAATGTTCCACCATCCTTCGAGTG 
CTTCATCTCCATGAATGTTCATTAAGGATGTTTGATTGCTCTTCAATCGGTAATC.

TTTTCAACATGGAAGTGCTCAGCTTTGCTAATTCTAGCATTGAATTGTTACCTTC 

CGTAATTGGAAATTTGAAGAAGTTGCGGCTGCTAGATrTGACAAACTGTTATGG 

TGTTCGTATAGAAAAGGATGTCTTGAAAAATTTGGTGAAACTTGAAGAGCTTTA 

TATTAGGAATGGTCTACCAGTTTACAGAGGAT 

RG2F deduced polypeptide sequence (SEQ ID NO:99) 

VEDTMMQRLKKWHEKKMFNFIVEAVIGEKTDPVAIQDAIADYLGVELNEKSKQA 

RADKLRQGFKDKSDGGKNKFFVILDDVWQSVDLEDIGLSPFPNQGVDFKVLLTSRD 

RHVCTVMGVEAKLILNVGLUEAEAQSLFHQFWTSEPELHKIGEDIVKKCFGLPIAI 

KTMACTUiHKRKDAWKDAI^RI^HHDIQSVWKVFETSYNNIjajKETKSVFm 

LFPEDLDIPffiELMRYGWGLRLFDRVNTTTQARNRLNTCIERLVHTNLLffiSVDGVH 

VKMHDLVRAFVLGMFSEVEHASIVNHGNMPEWTENDMTDSCKQISLTCKSMLEFP 

GDIJCFPNIJaLKlJVfflGGKSLRYPQDFYQGMEKLEVISYDEMKYPLLPSLPQCSTILR 

VLHLHECSUlMFDCSSIGNLFNMEVI^FANSSffilJJSVIGNLKKLRLLDLTNCYGV 

RIEKDVLKNLVKLEELYIRNGLPVYRG 

RG2G polynucleotide sequence (SEQ ID NO:100)

GAAGACACGATGATGAAGAACTGAAGGAGGTCGTGGGACAAAAGAAATCATTC 

AATATTATTATTCAAGTGGTCATAGGAGAGAAGACAAACCCTATTGCAATTCAG 

CAAGCTGTAGCAGATTACCTCTCTATAGAGCTGAAAGAAAACACTAAAGAAGC 

AAGAGCTGATAAGCTTCGTAAACGGTTTGAAGCCGATGGAGGAAAGAATAAGT 

TCCTTGTAATACTTGACGATGTATGGCAGTTTGTCGATCTTGAAGATATTGGTTT 

AAGTCCTCTGCCAAATAAAGGTGTCAACTTCAAGGTCTTGTTGACGTCAAGAGA 

TTCACATGTTTGCACTCTGATGGGAGCTGAAGCAAATTCAATTCTTAATATAAA 

AGTTTTAAAAGATGTAGAAGGACAAAGTTTGTTCCGCCAGTTTGCTAAAAATGC 

GGGTGATGATGACCTGGATCCTGCTTTCAATGGGATAGCAGATAGTATTGCAAG 

TAGATGTCAAGGTTTGCCCATTGCCATCAAAACCATTGCCTTAAGTCTTAAAGG 

TAGAAGCAAGTCTGCATGGGACGTTGCACTTTCTCGTCTGGAGAATCATAAGAT 

TGGTAGTGAAGAAGTTGTGCGTGAAGTTTTTAAAATTAGCTACGACAATCTCCA 

AGATGAGGTTACTAAATCTATTTTTTTACTTTGTGCTTrAm 

GATATTCCTACTGAGGAGTTGGTGAGGTATGGGTGGGGCTTGAAATTATTTATA 

GAAGCAAAAACTATAAGAGAAGCAAGAAACAGGCTCAACACCTGCACTGAGCG 

GCTTAGGGAGACAAATTTGTTATTTGGAAGTGATGACATTGGATGTGTCAAGAT 

GCACGATGTGGTGCGTGATTTTGTTTTGCATATATTCTCAGAAGTCCAACACGC 

TTCAATTGTCAACCATGGTAACGTGTCAGAGTGGCTAGAGGAAAATCATAGCAT 

CTACTCTTGTAAAAGAATTTCATTAACATGCAAGGGTATGTCTCAGTTTCCCAA 

AGACCTCAAATTTCCAAACCTTTCAATTTTGAAACTTATGCATGGAGATAAGTC 

ACTGAGCTTTCCTGAAAACTTTTATGGAAAGATGGAAAAGGTTCAGGTAATATC 

ATATGATAAATTGATGTATCCATTGCTTCCCTCATCACTTGAATGCTCCACCAA 

CGTTCGAGTGCTTCATCTTCATTACTGTTCATTAAGGATGTTTGATTGCTCTTCA 

ATTGGTAATCTTCTCAACATGGAAGTGCTCAGCTTTGCTAATTCTAACATTGAA 
TGGTTACCATCTACAATTGGAAATTTGAAGAAGCTAAGGCTACTAGATTTGACA.
AATTGTAAAGGTCTTCGTATAGATAATGGTGTCTTAAAAAATTTGGTCAAACTT 
GAAGAGCTTTATATGGGTGTTAATCGTCCGTATGGACAGGCCGTTAGCTTGACA 
GATGAAAA 

RG2G deduced polypeptide sequence (SEQ ID NO:101) 

RHDDEELKEWGQKKSFNmQVVIGEKTNPIAIQQAVADYLSIELKENTKEARADKL 

RKRFEADGGKNKFLVILDDVWQFVDLEDIGLSPLPNKGVNFKVLLTSRDSHVCTL 

MGAEANSILNKVLKDVEGQSLFRQFAKNAGDDDLDPAFNGIADSIASRCQGLPIAI 

KTIAI^UCGRSKSAWDVALSRLENHKIGSEEVVREVFKISYDNIXJDEVTKSIFLLCAL
FPEDFDIPTEELVRYGWGLKL^EAKTIREARNRLNTCTERLRET^fLLFGSDDIGCVK 
MHDWRDFVLHIFSEVQHASIVNHGNVSEWLEENHSIYSCKRISLTCKGMSQFPKDL 
KFPNLSILKLMHGDKSLSFPENFYGKMEKVQVISYDKLMYPLLPSSLECSTNVRVLH 
Un'CSUlMFDCSSIGlflXNMEVI^FANS^^EWIJSTIGNIJaa.RLIJ)LTNCKGU^ 

NG\TJCNLVKLEELYMGVNRPYGQAVSLTDE

RG2H polynucleotide sequence (SEQ ID NO:102) 

TGA.\GGAGGTTGTGGAACGAAAGAAAATGTTCAGTATTATTGTTCAAGTG 
GTCATAGGAGAGAAGACAAACCCTATTGCTATTCAGCAAGCTGTAGCAGA 

TTACCTCTCTATAGAGCTGAAAGAAAACACTAAAGAAGCAAGAGCTGATA
AGCTTCGTAAATGGTTCGAGGCCGATGGAGGAAAGAATAAGTTCCTTGTA 
ATACTTGACGATGTATGGCAGTTTGTCGATCTTGAAGATATTGGTTTAAG 
TCCTCTGCCAAATAAAGGTGTCAACTTCAAGGTCTTGTTGACGTCAAGAG 
ATTCACATGTTTGCACTCTGATGGGAGCCGAAGCCAATTCAATTCTCAAT 

ATA.\AAGTTTTAACAGCTGTAGAAGGACAAAGTTTGTTCCGCCAGTTTGC
TAAAAATGCGGGTGATGATGACCTGGATCCTGCTTTCAATAGGATAGCAG 
ATAGTATTGCAAGTAGATGTCAAGGTTTGCCCATTGCCATCAAAACCATT 
GCCTTAAGTCTTAAAGGTAGAAGCAAGCCTGCGTGGGACCATGCGCTTTC 
TCGTTTGGAGAACCATAAGATTGGTAGTGAAGAAGTTGTGCGTGAAGTTT 

TTAAAATTAGCTATGACAATCTCCAAGATGAGATTACTAAATCTATTTTT
TTACTTTGTGCTTTATTTCCTGAAGATTTTGATATTCCTACTGAGGAGTT 
GATGAGGTATGGATGGGGCTTGAAATTATTTATAGAAGCAAAAACTATAA 
GAGAAGCAAGAAACAGGCTCAACACCTGCACTGAGCGGCTTAGGGAGACA 
AATTTGTTATTTGGAAGCGATGACATTGGATGCGTCAAGATGCACGATGT 

GGTGCGTGATTTTGTTTTGCATATATTCTCAGAAGTCCAGCACGCTTCAA
TTGTCAACCATGGTAACGTGTCAGAGTGGCTAGAGGAAAATCATAGCATC 
TACTCTTGTAAAAGAATTTCATTAACATGCAAGGGTATGTCTGAGTTTCC 
CAAAGACCTCAAATTTCCAAACCTTTCAATTTTGAAACTTATGCATGGAG 
ATAAGTCGCTGAGCTTTCCTGAAAACTTTTATGGAAAGATGGAAAAGGTT 

CAGGTAATATCATATGATAAATTGATGTATCCATTGCTTCCCTCATCACT
TGAATGCTCCACTAACGTTCGAGTGCTTCATCTCCATTATTGTTCATTAA 
GGATGTTTGATTGCTCTTCAATTGGTAATCTTCTCAACATGGAAGTGCTC 
AGCTTTGCTAATTCTAACATTGAATGGTTACCATCTACAATTGGAAATTT
GAAGAAGCTAAGGCTACTAGATTTGACAAATTGTAAAGGTCTTCGTATAG 
ATAATGGTGTCTTAAAAAATTTGGTCAAACTTGAAGAGCTTTATATGGGT 
GTTAATCATCCGTATGGAC 

RG2H deduced polypeptide sequence (SEQ ID NO:103) 

KEVVERKKMFSIIVQWIGEKTNPIAIQQAVADyi^IEIJKENTKEARADKlJJK\^^ 

DGGKNKFLVILDDVWQFVDLEDIGLSPLPNKGVNFKVLLTSRDSHVCTLMGAEAN 
SILNIKVLTAVEGQSLFRQFAKNAGDDDLDPAFNRIADSIASRCQGLPIAIKTIALSLK 

GRSKPAWDHALSRLENHKIGSEEWREVFKISYDNLQDEITKSIFLLCALFPEDFDIP
TEEUvlRYGWGliMEAKTIREARNRLNTCTERIJlETNLIJ'GSDDIGCVKMro 
DFVnLHIFSEVQHASrVNHGNVSEWLEENHSIYSCKRISLTCKGMSEFPKDLKFPNLSI 
LKLMHGDKSLSFPENFYGKMEKVQVISYDKLMYPLLPSSLECSTNVRVLHLHYCSL 
RMFDCSSIGNLLNMEVLSFANSNIEWLPSTIGNLKKLRLLDLTNCKGLRIDNGVLKN 

LVKLEELYMGVNHPYG

RG2I polynucleotide sequence (SEQ ID NO:104) 

AAGAAGAGCTGAAGGAGGTTGTGGAACAAAAGAAAACGTTCAATATTATT 

GTTCAAGTGGTCATAGGAGAGAAGACAAACCCTATTGCTATTCAGCAAGC 

TGTAGCAGATTCCCTCTCTATAGAGCTGAAAGAAAACACTAAAGAAGCAA
GAGCTGATAAGCTTCGTAAATGGTTCGAGGCTGATGGAGGAAAGAATAAG 
TTCCTCGTNATACTTGACGATGTATGGCNGTTTGTTGATCTTGAAGATAT 
TGGTTTAAGTCCTCATCCAAATAAAGGTGTCANCTTCAAGGTCTTGTTGA 
CGTCAAGAGATTCACATGTTTGCACTCTGATGGGAGCTGAAGCCAATTCA 

ATTCTCAATATAAAAGTTTTAAAAGATGTAGAAGGAAAAAGTTTGTTCCG
CCAGTTTGCTAAAAATGCGGGTGATGATGACCTGGATCCTGCTTTCATTG 
GGATAGCAGATAGTATTGCAAGTAGATGTCAAGGTTTGCCCATTGCCATC 
AAAACCATTGCCTTAAGTCTTAAAGGTAGAAGCAAGTCTGCATGGGACGT 
TGCACTTTCTCGTCTGGAGAATCATAAGATTGGTAGTGAAGAAGTTGTGC 

GTG-\AGTTTTTAAAATTAGCTATGACAATCTCCAAGATGAGGTTACTAAA
TCTATTTTTTTACTTTGTGCTTTATTTCCTGAAGATTTTGATATTCCTAC 
TGAGGAGTTGGTGAGGTATGGGTGGGGCTTGAAATTATTTATAGAAGCAA 
AAACTATAAGAGAAGCAAGAAACAGGCTCAACACCTGCACTGAGCGGCTT 
AGGGAGACAAATTTGTTATTTGGAAGTGATGACATTGGATGCGTCAAGAT 

GCACGATGTGGTGCGTGATTTTGTTTTGCATATATTCTCAGAAGTCCAGC
ACGCTTCAATTGTCAACCATGGTAATGTGTCAGAGTGGCTAGAGGAAAAT 
CATAGCATCTACTCTTGTAAAAGAATTTCATTAACATGCAAGGGTATGTC 
TGAGTTTCCCAAAGACCTCAAATTTCCAAACCTTTCAATTTTGAAACTTA 
TGCATGGAGATAAGTCGCTGAGCTTTCCTGAAAACTTTTATGGAAAGATG 

GAA.AAGGTTCAGGTAATATCATATGATAAATTGATGTATCCATTGCTTCC
CTCATCACTTGAATGCTCCACCAACCTTCGAGTGCTTCATCTCCATGAAT 
GTTCATTAAGGATGTTTGATTGCTCTTCAATTGGTAATCTTCTCAACATG 
GAAGTGCTCAGCTTTGCTAATTCTGGCATTGAATGGTTACCATCTACAAT
TGGAAATTTGAAGAAGCTAAGGCTACTGGATCTGACAGATTGTGGAGGTC 
TTCATATAGATAATGGCGTCTTAAAAAATTTGGTCAAACTTGAAGAGCTT 
TATATGGGTGCTAATCGTCTGTTTGGAAAGTGCCAT 

RG2I deduced polypeptide sequence (SEQ ID NO:105) 

EEUCEWEQKKTFNIIVQWIGEKTNPIAIQQAVADSI^IEUCENTKEARADKLRKWF 
EADGGKNKFLVILDDVW?FVDLEDIGLSPHPNKGV?FKVLLTSRDSHVCTLMGAEA 
NSILNIKVLKDVEGKSLFRQFAKNAGDDDLDPAnGIADSIASRCQGLPIAIKTIALSL 

KGRSKSAWDVAI^R1£NHKIGSEEVVREVFKISYDNLQDEVTKSIFLLCALFPEDFDI
PTEELVRYGWGIJaJ^AKTIREARNRU^TCTERIJffiTM.LFGSDDIGCVK^ 
RDF\'LfflFSEVQHASIVNHGNVSEWLEENHSIYSCKRISLTCKGMSEFPKDLKFPNLS 
ILKLMHGDKSLSFPENFYGKMEKVQVISYDKLMYPLLPSSLECSTNLRVLHLHECSL 
RMFDCSSIGNLLNMEVLSFANSGIEWLPSTIGNLKKLRLLDLTDCGGLHIDNGVLKN 

LVKLEELYMGANRLFGKCH



RG2J polynucleotide sequence (SEQ ID NO:106) and (SEQ ID NO:107) 

ATGTCCGACCCAACAGGGATTGTTGGTGCCATTATTAACCCAATTGCTCA 
AACGGCCTTGGTTCCCCTTACAGACCATGTAGGCTACATGATTTCCTGCA 

GAA.AATATGTGAGGGACATGCAAATGAAAATGACAGAGTTAAATACCTCA
AGA-ATCAGTGCAGAGGAACACATTAGCCGGAACACAAGAAATCATCTTCA 
GATTCCATCTCAAATTAAGGATTGGTTGGACCAAGTAGAAGGGATCAGAG 
CGA,ATGTTGCAAACTTTCCAATTGATGTCATCAGTTGTTGTAGTCTCAGG 
ATCAGGCACAAGCTTGGACAGAAAGCCTTCAAGATAACTGAGCAGATCGA 

AAGTCTAACGAGACAAAATTCGCTGATTATCTGGACTGATGAACCTGTTC
CCCTGGGAAGAGTTGGTTCCATGATTGCATCCACCTCTGCAGCATCAAGT 
GATCATCATGATGTCTTCCCTTCAAGAGAGCAAATTTTTAGGAAAGCACT 
AGA.AGCACTTGAACCCGTCCAAAAATCCCACATAATAGCCTTATGGGGGA 
TGGGCGGAGTGGGGAAGACCACGATGATGAAGAAGCTGAAAGAGGTCGTG 

GAACAAAAGAAAACGTGCAATATTATTGTTCAAGTGGTCATAGGAGAGAA
GAC.AAACCCTATTGCTATCCAGCAAGCTGTAGCAGATTACCTCTCTATAG 
AGCTGAAAGAAAACACTAAAGAAGCAAGAGCTGATAAGCTTCGTAAACGG 
TTCGAAGCCGATGGAGGAAAGAATAAGTTCCTTGTAATACTTGACGATGT 
ATGGCAGTTTTTCGATCTTGAAGAtATTGGTTTAAGTCCTCTGCCAAATA 

AAGGTGTCAACTTCAAGGTCTTGTTGACGTCAAGAGATTCACATGTTTGC
ACTCTGATGGGAGCTGAAGCCAATTCTATTCTCAATATAAAAGTTTTAAA 
AGATGTAGAAGGAAAAAGTTTGTTCCGCCAGTTTGCTAAAAATGCGGGTG 
ATGATGACCTGGATCCTGCTTTCATTGGGATAGCAGATAGTATTGCAAGT 
AGATGTCAAGGTTTGCCCATTGCCATCAAAACCATTGCCTTAAGTCTTAA 

AGGTAGAAGCAAGTCTGCATGGGACGTCGCACTTTCTCGTCTGGAGAATC
ATA-AGATTGGTAGTGAAGAAGTTGTGCGTGAAGTTTTTAAAATTAGCTAT 
GACAATCTCCAAGATGAGGTTACTAAATCTATTTTTTTACTCTGTGCTTT 
ATTTCCTGAAGATTTTGATATTCCTATTGAGGAGTTGGTGAGGTATGGGT
GGGGCTTGAAATTATTTATAGAAGCAAAAACTATAAGAGAAGCAAGAAAC 
AGGCTCAACAACTGCACTGAGCGGCTTAGGGAGACAAATTTGTTATTTGG 
AAGTCATGACTTTGGGTGCGTCAAGATGCACGATGTGGTGCGTGATTTTG 

TTTTGCATATGTTTTCAGAAGTCAAGCATGCTTCAATTGTCAACCATGGT
AACATGTCAGAGTGGCCAGAGAAAAATGATACCAGCAACTCTTGTAAAAG 
AATTTCATTAACATGCAAGGGTATGTCTAAGTTTCCTAAAGACATCAACT 
ATCCAAACCTTTTGATTTTGAAACTTATGCATGGAGATAAGTCGCTGTGC 
TTTCCTGAAAACTTTTATGGAAAGATGGAAAAGGTTCAGGTAATATCATA 

TGATAAATTGATGTATCCATTGCTTCCCTCATCACTTGAATGCTCCACTA
ACGTTCGAGTGCTTCATCTCCATTATTGTTCATTAAGGATGTTTGATTGC 
TCTTCAATTGGTAATCTTCTCAACATGGAAGTGCTCAGCTTTGCTAATTC 
TAACATTGAATGGTTACCATCTACAATTGGAAATTTGAAGAAGCTAAGGC 
TACTAGATTTGACAAATTGTAAAGGTCTTCGTATAGATAATGGTGTCTTA 

AAA.\ATTTGGTCAAACTTGAAGAGCTTTATATGGGTGTTAATCGTCCGTA
TGGACAGGCCGTTAGCTTGACAGATGAAAACTGCAATGAAATGGTAGAAG 
GTTCCAAAAAACTTCTTGCACTAGAATATGAGTTGTTTAAATACAATGCT 
CAAGTGAAGAATATATCCTTCGAGAATCTTAAACGATTCAAGATCTCAGT 
GGGATGTTCTTTACATGGATCTTTCAGTAAAAGCAGGCACTCATACGAAA 

ACACGTTGAAGTTGGCCATTGACAAAGGCGAACTATTGGAATCCCGAATG
AACGGGTTGTTTGAGAAAACGGAGGTTCTTTGTTTAAGTGTGGGGGATAT 
GTATCATCTTTCAGATGTTAAGGTGAAGTCCTCTTCGTTCTACAATTTAA 
GAGTCCTTGTCGTTTCAGAGTGTGCAGAGTTGAAACACCTCTTCACACTT 
GGTGTTGCAAATACTTTGTCAAAGCTTGAGCATCTTAAAGTCTACAAATG 

CGATAATATGGAAGAACTCATACATACCGGGGGTAGTGAAGGAGATACAA
TTACATTCCCCAAGCTGAAGCTTTTATATTTGCATGGGCTGCCAAACCTA 
TTGGGTTTGTGTCTTAATGTCAACGCAATTGAGCTACCAAAACTTGTGCA 
AATGAAGCTTTACAGCATTCCGGGTTTCACAAGCATTTATCCGCGGAACA 
AGTTGGAAGCATCTAGTTTGTTGAAAGAAGAGGTACATATACATATAGTT 

TATGTTAATACATTTTAAACAATCTTTTCAACTAAAAGTTTCAGAATATA
TCTGTATTTTGATTGTATGATGTGTTAGTGTTTGGATGTGGCTATTAAAG 
GAT.AATTATTTGGCAGGTTGTGATTCCTAAGTTGGATATACTTGAAATTC 
ATGACATGGAGAATTTAAAGGAAATATGGCCTAGTGAGCTTAGTAGAGGT 
GAGAAAGTTAAGTTGAGAAAGATTAAAGTGAGAAATTGTGATAAACTTGT 

GAATCTATTTCCACACAATCCCATGTCTCTGCTGCATCATCTTGAAGAGC
TTATAGTCGAGAAATGTGGTTCCATTGAAGAGTTGTTCAACATCGACTTG 
GATTGTGCCAGTGTAATTGGAGAAGAAGACAACAACAGCAGCTTAAGAAA 
CATCAATGTGGAGAATTCAATGAAGCTAAGAGAGGTGTGGAGGATAAAAG 
GTGCAGATAACTCTCGTCCCCTCTTTCGTGGCTTTCAAGTTGTTGAAAAG 

ATAATCATTACGAGATGTAAGAGGTTTACAAATGTATTCACACCTATCAC
CAC.AAATTTTGATCTGGGGGCACTTTTGGAGATTTCAGTTGATTGTAGAG 
GAAATGATGAATCAGACCAAAGTAACCAAGAGCAAGAGCAGGTATGGATT 
TCA.ATTTTACTCTTTTACTTAATTAATGATTAAGCCCCTGCTTTTTAATA 
AAAAGGGGACAAACCATTTCTTGACTTAATGTTGCAATACAAGTCATGTA

TTTCATTATAGATCATGTATAAATGTGACTAATTTTTTTCATCGCCTAAC 
TTTTGTTGATAAATCATTAGAAATGTCACTAATTACTTTTTAGTATTTAT 

AAA-ATAACTACAAAACATGTTTTTTCATTATAGATCATGTATATATCAAC
TAAAAATATTATTCCCTTACACAAAAAAAAAAGGTTCAAGAAAGCCTGTA 
TTTCGAAATAACTAAAAAGAAAATATTTGATATTCACTAAGAGAAATTTT 
TTTCTAAACATGATCGCAAATGATTAAAACTTAAATTAAAACTAAAAAGA 
TTTTTATATATGTTATNCAAAATTAAAATTTGAAATTAAGTTTATAATTC 

TNGTNTCACAAAGGGATATATATAGTAAAATATTATTTTTTTGCAGTCAT
GCATAGTTGTATTTTTAAATGATTTATTAACGTGGTAGGAGTGGAAACCA 
CTC.\ATCTAGTAGACCCACTATCACATGTCACATCAGCTTTACATCTATT 
TTTCTTTCTCCTTTTTTCATCTTTTTAAACTCATAACACNTAAAANTANC 
ATATTTTCCAACACACTNAACTCATTGTCACATTATTATTTTTAATTTAA 

TTA.-\ATTNGAAAATTAAAATTAANTAAANCNTAACATTTTTTAATTAAAA
AATATTAATCCAAATAAAAANTNCACGATAAATTAAAAANGTTTANTTTG 
GAA-AAAAANCC (SEQ ID NO:106) 
Sequence gap 

ATA.ACCCTTTCAAGGGTCAACTCAAGTCCAAGTTAAAGTCAAGGTCAAAA 

CCTTGGTTAAAGTCAACTTTGGTCAAAGTCAACATCTACTTGACTCACCT
CACCGAGTTGGTCCACCAACTTGTCGAGTCCCTTAATCCACAAACTTCAA 
GAACTTCGATCCTACTCGTCGAGTCTTTCAAGAACTCTTCGAGTTTCCAT 
TACACAGAATCGGGACCTTTTGCTCATGACTCGCCGAGTTCATCCTTGAA 
CTTGTCGAGTCTAGCTTCATACGAGTTCGAGTGTTTAGTCCTTGACTCGT 

CGAGTTCTTCCTTGAACTCGTCGAGTCCATCTTCGTATAGTTGGGACATT
GCCTTGAACTCACCGAGTTCATCATTGAACTCATCGAGTCCTTCGATCTT 
CAAGTCCATAATCCTGTCCATCTTGTTGAGTCCTCTTCTAGACTCAACCA 
GATTCCTCAGAAACAGAAAAGGTTAGGGAACCATTACCTGACTCGCCGAG 
TCCCAAGAACGAATCCCCGAGTCCCCCAATGTCCATGACCATACAATCGA 

TTTTCGTTGGGCTCATTGCATCCAAAGCATAGATCTAACCTCCTAGGGTC
CATATTACACGTAAAGCTACGAACTTGACGTCCATGCATGGGGGATTTGG 
CTCAAATGGCATTAAAATGGGGTTTATCTGATGCATGGGACTCCCATGGC 
CAT.AAAGTTAACACCTTTATGCCATGGGAATCCTCAATGGTTCCATATCT 
GAAGTTAACACTCTACAATATGTTCTAAACCCGAAGGTGGCTTAGAAATG 

CCCCAAAATGGCAAGATTCAAGCCTTAAAGGAGATCTAACAAATGATAAG
TCA.AGGTTCAAGCTTTTTACCTTGAATAAGCTGGAAATGAAGCAAAATCT 
CTGGATCCACTTGCTTCTTCAAGAACCCCCAAGCTTCCACTTCTTCCTTC 
AAGTTTCAAACAACTTTAAACACTCAAAAATGGCTCAAGAACACTCAAAA 
AGCTTTAGGGTTTCGAGTTAGGGCTTTTTGGAAGCGAGAGGGACGATGGG 

GGCTGAAATGAGGCTAGAAAAAGTGTTTAAATAGGGGGCAAACCCTAAAT
ATTAGGGTTTCATCCAGGCAGCCCTACTCGTCGAGTCGGGCTCCCGACTC 
GTCGAGTAGGTCACTTAAAACCCGCGTCCATAATCCAGTCTACTCGACGA 
GTTGGGCCTCCAACTCGTCGATTCCGAGTGCAAAACGTTCAATTACTTAA 
ATTTAAATATGTACCAGGAACCGGGTGTTACAGTTGAGACTTTATACCTC
CATAAGATAGATCTAGGTGCACATAGCCTGGATCCACAAGCTCCATGTCA 
ACAAGCGACTCTTCAAGAAGTTCATTCTTCCTCCTTAAGCACCAAAAAAC 
ACACAAAATCACCATGAAGCTCAAGAAATACTCAAATAGAGGATAGGGTT 

TCGTTCGTAGGGTTAGAGAGGATGGAGGCTAGAGGAAATGAGGGATAGAG
GCGAGTTAAGGTCTTTAAATAGGGTCCAAGACCCTAAATTAGGGTTTTAA 
TCTGGCCAGACGAACGCAGGGTGTTCCCAAATGCATATGTGTCCAAATTC 
TCGTGTGCGCCATGCGTACCTCCCTTGTACGCCATGTGTACCGGGTTTGG 
TCCAAACCCTTCTAACTTCAAATGATCATAACTTGCACCCCTTATCTGTT 

TTCGATGTTCTTTATATCCACGGAAAGGTAACAAGAAGCCCTATACTTCT
ATAAACTTTATTTAATCTGAAAACCAACCGAAATTAAATCCAAAATTCAT 
AAAAGTCCCGAACCAACACATTTACCGATACCCTTGGGCTCCAAAACACA 
AATTGAAAACCCGGATCATCCAAACTACATCATCCACCTCCAAATGAGCC 
CAAACTCAATTATTCAAGGGTTCTAAGCCTGTTAATGCCCACTCCTCGAT 

TACCACCCCGCAATGGGAAACGATTCAAAACAGGGCGTTACATAATTTGT
TGTGGTTTTGTATTTTTTATTTCCGGTGAAGGTGAAAGATCCAACTATTT 
TTAATCTGTTGGCATTTTCCATCATTTGCAACTGTTTCTTGAAAAAAAAA 
TACCTAAAATCAAAATAACCATTTTCAAATCCAAAATTATAAGAGAGAAT 
TGT.\AATGGACATGGAATCTTAAATCATTAACACAGTTCAGTACACAAGT 

TGCTAATTACATTTCTTGCTGTGCAGATTGAAATTCTATCAGAGAAAGAG
ACATTACAAGAAGCCACTGACAGTATTTCTAATGTTGTATTCCCATCCTG 
TCTCATGCACTCTTTTCATAACCTCCAGAAACTTATATTGAACAGAGTTA 
AAGGAGTGGAGGTGGTGTTTGAGATAGAGAGTGAGAGTCCAACAAGTAGA 
GAATTGGTAACAACTCACCATAACCAACAACAGCCTGTTATATTTCCCAA 

CCTCCAGCATTTGGATCTAAGGGGTATGGACAACATGATTCGCGTGTGGA
AGTGCAGCAACTGGAATAAATTCTTCACTCTTCCAAAACAACAATCAGAA 
TCCCCATTCCACAACCTCACAACCATAAATATTGATTTTTGCAGAAGCAT 
TAAGTACTTGTTTTCACCTCTCATGGCAGAACTTCTTTCCAACCTAAAGA 
AAGTCAATATAAAATGGTGTTATGGTATTGAAGAAGTTGTTTCAAACAGA 

GATGATGAGGATGAAGAAATGACTACATTTACATCTACCCACACAACCAC
CATCTTGTTCCCTCATCTTGATTCTCTCACTCTAAGTTTCCTGGAGAATC 
TGAAGTGTATTGGTGGAGGTGGTGCCAAGGATGAGGGGAGCAATGAAATA 
TCTTTCAATAATACCACTGCAACTACTGCTGTTCTTGATCAATTTGAGGT 
ATGCTTTGTTCATATTCAATTATTTATTTAATTTCCTTTTTTATTT 

TATTCTATAAATAATACATTTTATACCCACTATACTAAGATAATAATTAC
CTAGAGGGATGGATGCTATGACACAGCTGCTACACTTCAGAAACTCTAGT 
AAGGGCAGTTATGGAAGTTCAATAAAATGATAATGGCATCTTTTGATGGG 
TAATATAGGCAATTTAAGTTTTATTTCTGTTAAAGCAGTATTTAGCAAGT 
ACTGGCCAGTAGGAGAGGAGAATATCACCTTTTGTGAAAATCTGGTCATT 

GTACCCAGAATTTAGTTAAATGTAACATTTTAGATATCAGGGGACATCAG
GTGACAGATATTGTAGAATAGAACAATATATAATATTACCCAAAACTATT 
TTTTCTAAGGTTTTTCTGTTAAATATGTGCTTTCTTGATTTCATTGAATT 
TGCATTCCTATATTTTAGGTGGTAAAGTGATTGTCTCTTCAATAAATCCC 
GAAATTAATTAAAAAAAAAAAAAAACAAAAGTAAATTTTTGATATGGAGA
GCACTGGTATCATTTAGTATATAAAAAAACTAGATTrTGAATTAAGTTTC 
TTATATAAAAGCTGTGTATATAGTTTAATTAGTTTTACATCATTTTTCCA 
TGTGGTGTTGCAGTTGTCTGAAGCAGGTGGTGTTTCTTGGAGCTTATGCC 

AATACGCTAGAGAGATAAGTATAGAATTCTGCAATGCATTGTCAAGTGTG
ATTCCATGTTATGCAGCAGGACAAATGCAAAAGCTTCAAGTGCTGACAGT 
CAGTTCTTGTAATGGTCTGAAGGAGGTATTTGAAACTCAATTAAGGAGGA 
GCAGCAACAAAAACAACGAGAAGAGTGGTTGTGATGAAGGAAATGGTGGA 
ATTCCAAGAGTAAATAACAATGTTATTATGCTTTCTGGTCTGAAGATATT 

GGA.AATCAGCTTTTGTGGGGGTTTGGAACATATATTCACATTCTCTGCAC
TTGAAAGCCTGAGACAGCTCGAAGAGTTAACGATAATGAATTGCTGGTCA 
ATG.\AAGTGATTGTGAA(jAAGGAAGAAGATGAATATGGAGAGCAGCAAAC 
AACAACAACAACGAAGGGGACTTCTTCTTCTTCTTCTTCTTCTTCTTCTT 
CTTCTTCTTCTTCTTCTTCTCCTCCTTCTTCTTCTAAGAAGGTTGTGGTC 

TTTCCTTGTCTAAAGTCCATTGTATTGGTCAATCTACCAGAGCTGGTAGG
ATTCTTCTTGGGGATGAATGAGTTCCGGTTGCCTTCATTAGATGAACTTA 
TCATCGAGAAATGCCCAAAAATGATGGTGTTTACAGCTGGTGGGTCCACA 
GCTCCCCAACTCAAGTATATACACACAAGATTAGGCAAACATACTATTGA 
TCA-AGAATCTGGCCTTAACTTTCATCAGGTATATATGTTTCTTTAATTGG 

CATCATCTAATTAAGAAAGATATCATTCCTGCCAAGTAAATTTACTTCAA
ACACATTCACACTGGTTTCAGTCTAAGTTTATGTTGTTCTAGGAAGGCCA 
AAATGGGAAAGCAAGATAGGGAAAAATAGTGTATTTCAGTGGAAAGGGTA 
TTTTAGGTATTTTCTGTCAAAAGTTGTTATTGCAGGCTTTTTAGTACCTG 
GAATCGTGTGTGGGAGGAGCATTATTATTCTGATTTGCTTGTTTCTTTAT 

CATTTTTTCTTAGCCTCTCGAACAGCTAGAAACCCTTTTAATCTTTTGAT
TTT.AAATGACAAAATTTTTCCCTGTTACTCTATTTGATTGTTGTTCTTCA 
TGGTTCTAAGTGAGTTATTGGCTCATCTGTTACTTCTTTTGATTGTTATT 
TTCATAGCATGTTAGTCACTTGAATCAAGCTTTTTCATTTTCAACCAGGG 
CAAAAGGTCAAAAGTAACCTACTTTATGAGATCAAAAACAGCAACCCATC 

GGATAACTTTTAGTTGGAGTTAATAGTTACAATTACCATTGTGATTAATA
ATTATAATATCCTGTATTAATTCATAAAAATTGGTACAGCACATATATGA 
CATTTCAAAGGTTTTTGTTTGACATATATATGCCTCTGGCGTTTTCTTTA 
TTGGACTTGCAGACCTCATTCCAAAGTTTATACGGTGACACCTTGGGCCC 
TGCTACTTCAGAAGGGACAACTTGGTCTTTTCATAACTTGATTGAATTAG 

ATGTGAAATTTAATAAGGATGTTAAAAAGATTATTCCATCCAGTGAGTTG
CTGCAACTGCAAAAGCTGGAAAAGATAAATATAAACAGTTGTGTTGGGGT 

agaggaggtatttgaaactgcattggaagcagcagggagaaatggaaata 
gtggaattggttttgatgaatcgtcacaaacaactaccactactcttgtc 
aatcttccaaaccttagagaaatgaacttatggggtctagattgtctgag 
gtatatatggaagagcaatcagtggacagcatttgagtttccaaaactaa
ca.agagttgaaattagtaattgcaacagtttagaacatgtatttactagt 
tccatggttggtagtctatcgcaactccaagagctacatataagtcagtg 
ca.aacttatggaggaggtgattgttaaggatgcagatgtttctgtagaag 
AAGACAAAGAGAAAGAATCTGATGGCAAGATGAATAAGGAGATACTTGCG
TTACCTAGTCTAAAGTCCCTGAAATTAGAAAGCTTACCATCTCTTGAGGG 
GTlTAGCrrGGGGAAGGAGGATTTTTCATrCCCATTATTGGATACTTTAA 
GAATTGAGGAATGCCCAGCAATAACCACCTTCACCAAGGGAAATTCCGCT 

ACTCCACAACTAAGAGAAATAGAAACAAGATTTGGCTCGGTTTATGCAGG
GGA.-\GACATCAAATCCTCTATTATAAAGATCAAACAACAGGTAAATCAGA 
TCATTGTTGGTTTAATAATTCTTAAACTACATTTGAAAAGTTTCATGTAA 
GTTTTTTATTATTGTCAAAAGCCGCAACCrATATTTTCAACTTTATATTT 
ATGTACTTTATGCAGGATTTCAAAAAAGCCCAGGACTCTATTTAATGTGA 

AGTAAATACTAGAAGAGGTAAATTCTATTTACATGTCTCCTGATTGCCTA
TTAATTAATGGCCTTTCAGTTCATGGTTTTTGGATGTATTCTTCATGATG 
ACGTGAATGTTTAAATACCCCACTAGTTAATTGTTAGGTTGAATGTTGAT 
GACCAAAGGACTATATGTCGGGAAGAATATTCAAGGAAAGAATTGTTCAT 
CATATGAAGGGCATTAAATTAAGAAGAACATGGATGCTATGAAGATGTTG 

GGA.\AATATATGAATCAAATAACAAGCTACTCACTTATCTAAGTTTGTTG
GTTGAGGATGTTGATTTTAATATTTCAAATTCATTGGTATCATTATATGG 
GTTTATCAGTAGTGTTAATGGGATAATGAGCAACTTAACCTTAAATTATG 
CTGTTGGTAAATGTTGGACTCAAGTATGGAAAATTAGGAATAACTTGTGA 
AAA.\TATATGCAAAAGTAGGATTGAGATTTTCAATGAAAAAAATTATGAA 

ACTATACTACTATAGTATATAAATAAATTCAACTTACTGTTGGGTATATT
GGAAGCACATATCATGAAAGTAACTAGAAGCAGAATTrGTTCCCATCTTC 
ATCTACTTATAGTTTCCATTTCTTACTTGTAAAAATCTGATTAAACTTTA 
GAGTTATTTCTATTTTTTACCAACCAAAATTTTCATATAAAGGCCACAAG 
T (SEQ ID NO:107) 

RG2J deduced polypeptide sequence (SEQ ID NO:108) 

MSDPTGIVGAIINPIAQTALWLTDHVGYMISCRKYVRDMQMKMTELNTSRISAEEH 

ISRNTRNHLQIPSQIKDWLDQVEGIRANVANFPIDVISCCSLRIRHKLGQKAFKITEQI 

ESLTRQNSLIIWTDEPVPLGRVGSMIASTSAASSDHHDVFPSREQIFRKALEALEPVQ 

KSHIIALWGMGGVGKTTMMKKLKEVVEQKKTCNirVQVVIGEKTNPIAIQQAVADY
L^IEIJCENTKEARADKIJIKRFEADGGKNKFLVILDDVWQFFDLEDIGLSPLPNKGV 
NFKVLLTSRDSHVCTLMGAEANSILNIKVLKDVEGKSLFRQFAKNAGDDDLDPAH 
GIADSIASRCQGLPIAIKTIALSLKGRSKSAWDVALSRLENHKIGSEEVVREVFKISYD 
NLQDEVTKSIFLLCALFPEDFDIPIEELVRYGWGLKLFIEAKTIREARNRLNNCTERL 

RETNLLFGSHDFGCVmHDVVRDFVLHMFSEVKHASIVNHGNMSEWPEKNDTSN
SCKRISLTCKGMSKFPKDINYPNLULKLMHGDKSLCFPENFYGKMEKVQVISYDKL 
MYPLLPSSLECSTNVRVLHLHYCSLRMFDCSSIGNLLNMEVLSFANSNIEWLPSTIG 
NUCKLRLI-DLTNCKGLRIDNGVLKNLVKLEELYMGVNRPYGQAVSLTDENCNEM 
VEGSKKLLALEYELFKYNAQVKNISFENLKRFKISVGCSLHGSFSKSRHSYENTLKL 

AIDKGELLESRMNGLFEKTEVLCLSVGDMYHLSDVKVKSSSFYNLRVLVVSECAEL
KHLFTLGVANTLSKLEHLKVYKCDNMEELIHTGGSEGDTITFPKLKLLYLHGLPNL 
LGLCLNVNAIELPKLVQMKLYSIPGFTSIYPRNKLEASSLLKEEVVIPEELIVEKCGSI 
EEIJFNmLDCASVIGEEDNNSSLRMNVENSMKLREVWRIKGADNSRPLFRGFQVVE 
KmTRCKRFTNVFTPITTNFDLGALLEISVDCRGNDESDQSNQEQEQIEILSEKETLQE
ATDSISNVVFPSCIJVIHSFHNIXJKULNRVKGVEVVFEIESESPTSRELVTTHHNQQQP 
VIFPNLQHU)LRGMDNMmVWKCSNWNKFFrU»KQQSESPFHNLTTINroFCRSIKY 
LFSPLMAELLSNLKKVNIKWCYGIEEVVSNRDDEDEEMTTFrSTHTTTILFPHLDSL 

TLSFLENLKCIGGGGAKDEGSNEISFNNTTATTAVLDQFELSEAGGVSWSLCQYAR
EISIEFCNALSSVIPCYAAGQMQKLQVLTVSSCNGLKEVFETQLRRSSNKNNEKSGC 
DEGNGGIPRVNNNVIMLSGIJamSFCGGI^HIFrFSALESUlQLEELTIMNCWSMK 
VIVKKEEDEYGEQQTTTTTKGTSSSSSSSSSSSSSSSSPPSSSKKVVVFPCLKSIVLVNLP 
ELVGFFLGMNEFRLPSLDELIIEKCPKMMVFTAGGSTAPQLKYIHTRLGKHTIDQES 

GLNFHQDIYMPLAFSLLDLQTSFQSLYGDTLGPATSEGTTWSFHNUELDVKFNKD
\TaaiPSSELLQLQKLEKININSCVGVEEVFETALEAAGRNGNSGIGFDESSQTTTTTL 
VNLPNLREMNLWGLDCLRYIWKSNQWTAFEFPKLTRVEISNCNSLEHVFTSSMVGS 
I^QLQELHISQCKLMEEVIVKDADVSVEEDKEKESDGKMNKEILALPSLKSLKLESL 
PSLEGFSLGKEDFSFPLLDTLRIEECPAITTFTKGNSATPQLREIETRFGSVYAGEDIKS 

SIKIKQQDFKKAQDSI.CEVNTR

RG2K polynucleotide sequence (SEQ ID NO:109) and (SEQ ID NO:110) 

TGGGATTCCATATATAAAAACATATATTTTTATAAAGTGGGATTCCATTG 
TTTATATAGATTTTTATTCACCAATAGACAATAGATTAAAAAAAGATATA 

AAAACATGTCGGCTTTTGACTAAAAATATAGATTTTTATGAATAGAATAT
TCAATTTGCTTAACTCGTTTAAAAAAAATGAAAAAGATGTCGATATAAAA 
TCTCATATGGGCCTTCTTTACCATTCAAATAGTAAAATAGTAAAAGATAC 
TTGTTTGGGGCATGAACTGACCATAGTCAAACCCATACAAAATCAAACGA 
ATCCCACATGGATGATGACGATGGGGTCGCAGTAAATGTGTTTTGGTCCT 

TTTTTTTCGAGAGAACAGAAGCTTCTGCTCTTCATCTTCTTTAGATTTTG
GGGATTTTCTGGTTTCAGGGGTTTGTGAGTGGAAACTAAATTGAAGCAAA 
AAAGTATGGTATAATTGGTTGCTAGTGAAATTGATGCTTTCTATTACTAT 
CATCTTTAAAATTGTCAAAACATTATGTATTAAATTATGAGATCGAAAGT 
GGTCTATGGGCCAAAGGTAATACAAGCTTACTCAATGAAATGAATCTAGG 

ATGCATCATGCATGTATTGGTTAGATTAAAGATTTTCATCAAATTTCCTT
TATCAAATTGTTGTATACCATGTTATGTAGGTGCTACCACAAGCCATAAC 
ATCGAGCAATGGAGTGTATTACTGGCATCTTTAGCAACCCGTTTGCTCAG 
TGTCTCATCGCTCCTGTGAAAGAACACCTTTGCCTTCTGATTTTCTATAC 
ACAATATGTAGGGGATATGCTTACTGCAATGACGGAGTTGAATGCTGCAA 

AAGACATTGTTGAAGAGCGGAAGAATCAAAACGTAGAAAAATGTTTTGAG
GTTCCAAACCATGTCAACCGTTGGTTGGAAGATGTTCAAACAATCAACAG 
AAAAGTGGAACGTGTTCTTAACGATAATTGCAATTGGTTCAATCTATGTA 
ATAGGTACATGCTCGCAGTGAAAGCCTTGGAGATAACTCAGGAGATCGAT 
CATGCCATGAAACAACTCTCTCGGATAGAATGGACTGATGATTCAGTTCC 

TTTGGGAAGAAATGATTCCACAAAGGCATCCACCTCTACACCATCAAGTG
ATTACAATGACTTCGAGTCAAGAGAACACACTTTTAGGAAAGCACTTGAA 
GCACTTGGATCCAACCACACATCCCACATGGTAGCCTTATGGGGGATGGG 
TGGAGTTGGGAAGACCACGATGATGAAGAGGCTGAAAAATATTATTAAAG
AAAAGAGGACGTTTCATTATATTGTTTTGGTGGTTATAAAGGAAAATATG 
GATCTCATTTCCATCCAGGATGCTGTAGCAGATTATCTGGATATGAAGCT 
AACAGAAAGCAATGAATCAGAAAGAGCCGATAAACTTCGTGAAGGGTTTC 

AGGCCAAATCAGATGGAGGTAAGAATAGGTTCCTCATAATACTGGATGAT
GTATGGCAATCTGTTAATATGGAAGATATTGGTTTAAGTCCTTTTCCGAA 
TCAAGGTGTCGACTTCAAGGTCTTGTTGACCTCGGAAAACAAAGATGTTT 
GTGCAAAAATGGGAGTTGAAGCTAATTTAATTTTCGACGTGAAATTCTTA 
ACAGAAGAAGAAGCACAAAGTTTGTTTTATCAATTTGTAAAAGTTTCTGA 

TACCCACCTTGATAAGATTGGAAAAGCTATTGTAAGAAACTGTGGTGGTC
TACCCATTGCCATCAAAACCATAGCCAATACTCTTAAAAATAGAAACAAG 
GATGTATGGAAGGATGCACTTTCTCGTATAGAGCATCATGACATTGAGAC 
AATTGCACATGTTGTTTTTCAAATGAGCTACGACAATCTCCAAAACGAAG 

ATTCCTACTGAGGAATTGGTGAGGTATGGATGGGGATTGAGAGTATTTAA
TGGAGTGTATACTATAGGAGAAGCAAGACACAGGTTGAACGCCTACATCG 
AGCTGCTCAAGGATTCTAATTTATTGATTGAAAGTGATGATGTTCACTGC 
ATC.\AGATGCATGATTTAGTTCGTGCTTTTGTTTTGGATACGTTTAATAG 
ATTCAAGCATTCTTTGATTGTTAACCATGGTAATGGTGGTATGTTAGGGT 

GGCCTGAAAATGATATGAGTGCCTCATCTTGCAAAAGAATTTCATTAATA
TGC.\AGGGCATGTCCGATTTTCCTAGAGACGTAAAGTTTCCAAATCTCTT 
GATTTTGAAACITATGCATGCAGATAAGTCTTTGAAGTTTCCTCAAGACT 
TTTATGGAGAAATGAAGAAGCTTCAGGTTATATCATACGATCACATGAAG 
TATCCCTTGCTTCCAACATCACCTCAATGCTCCACCAACCTTCGTGTGCT 

TCATCTTCATCAATGCTCATTGATGTTTGATTGCTCTTCTATTGGAAATC
TGTTGAATCTGGAAGTGCTCAGCTTTGCTAATTCTGGTATTGAGTGGTTG 
CCTTCCACAATCGGAAATTTGAAGGAGCTAAGGGTACTAGATTTGACAAA 
TTGTGATGGTCTTCGTATAGATAATGGTGTCCTAAAGAAATTGGTGAAAC 
TTG.\AGAGCTTTATATGAGAGTTGGTGGTCGATATCAAAAGGCCATTAGC 

TTCACTGATGAAAACTGCAATGAAATGGCAGAGCGTTCAAAAAATCTTTC
TGCATTAGAATTTGAGTTCTTCAAAAACAATGCTCAACCAAAGAATATGT 
CATTTGAGAATCTTGAACGATTCAAGATCTCAGTGGGATGTTATTTTAAG 
GGAGATTTCGGTAAGATCTTTCACrCTTTTGAAAACACGTTGCGGTTGGT 
CACCAACAGAACTGAAGTTCTTGAATCTAGGCTTAATGAGTTGTTTGAGA 

AAACAGATGTTCTTTATTTAAGTGTGGGAGATATGAATGATCTTGAAGAT
GTTGAGGTAAAGTTGGCACATCTTCCTAAATCCTCTTCCTTCCACAATTT 
AAGAGTCCTTATCATTTCTGAGTGTATAGAGTTGAGATACCTTTTCACAC 
TTGATGTTGCAAACACTTTGTCAAAGCTTGAGCATCTTCAAGTTTACGAA 
TGCGATAATATGGAAGAAATCATACATACAGAGGGTAGAGGAGAAGTGAC 

AATTACATTCCCAAAGCTGAAGTTTTTATCATTGTGTGGGCTACCAAATC
TGTTGGGTTTGTGTGGTAATGTGCACATAATTAATCTACCACAACTCACA 
GAGTTGAAACTTAATGGCATTCCAGGTTTCACAAGCATATATCCTGAAAA 
AGATGTTGAAACATCTAGTTTGTTGAATAAAGAGGTAAATGTGTTTTATG 
TTAATACAATACAATCTTTTCAATTAACCGTTTCAAAATATATTGTATGA
TTTATTTTTGTTTGGATGGGGTTATTAATGGGTGATTATTTCTCAGGTTG 
TAATTCCTAATTTGGAGAAACTTGATATTAGTTATATGAAGGATTTGAAA 
GAGATATGGCCTTGTGAATTAGGGATGAGTCAGGAAGTTGATGTTTCTAC 

GTTGAGAGTGATTAAAGTAAGCAGTTGTGATAATCTTGTGAATCTATTCC
CGTGCAATCCTATGCCATTGATACATCACCTTGAAGAGCTTCAAGTGATA 
TTTTGTGGTTCCATTGAAGTGTTATTCAACATTGAGTTGGATTCTATTGG 
TCAAATTGGAGAAGGCATCAACAATAGCAGCTTGAGAATCATCCAATTGC 
AGAACTTAGGGAAGCTAAGTGAGGTGTGGAGGATAAAAGGTGCGGATAAC 

TCTAGTCTTCTCATCAGTGGCTTTCAAGGTGTTGAAAGCATTATCGTTAA
CAAATGCAAGATGTTTAGAAATGTATTCACACCTACCACCACCAATTTTG 
ATCTGGGGGCACTTATGGAGATTCGGATACAAGATTGTGGAGAAAAGAGG 
AGAAACAACGAATTGGTAGAGAGTAGCCAAGAGCAAGAGCAGGTATGGCT 
TTCAATTTCACTTTCTTACTTAATGAAGGATTAAGCTCCTGCTTTTTGAA 

TAA-AAAGTGGATGAATGACTAAATTCGGGAATGCCACCCGGAAAGTTATC
AACCATTTAGCTACACCATTTTTTGAACTAATGTTGCAATAAATGCATAA 
TATAATTAAAAAATGGTCATTGATAAATGTAAACCAACCTTTTTTATTTA 
TTAAAATGTCTACAATAAATGATTTTCTTTATTATATATCATTTTATAAC 
AATAAGCTTAAAGATGTTTAAATAGCCAATGTCAGTTATAGATCGTAACT 

AATTTTTTATTAACTAGTTTTAGTTAAGATATCACTCATTATTATTTTTA

TAGAAAAAAGACAAGATTGGCTAATCCTCATAAGAATTTGGAAGATTTAA 
GCAAAATATAGAGCTTTTCCAAACATAGCCAATAGTTTCTTTTGCAGGTC 
CCATCTACGAAATTATCAATAGATTTGCGATTTTTTTTTGGCACCCGGGA 
AATTTCCATTAATTAAAAAAAAGTTCAAGCCATTTTGTAGTTGGCACCTG 

CAA.\ATGGTAGTTTGCACCTGCGGAAATCACCTTTCACCATTTCGCATCT
ATGACTTGTGAAAATGTTAATTTGTGAAATGGTCATGTGCACCTCATGAG 
AAATACGAAATGGTCAGTAATATGACTTTTTTATATAAATATGATGGTGG 
CATATATTTATAGGAAAATATAGCTGCACGATATTAATTAATAGTGAAAT 
TAGTTAACTGTATACGATAAGTATACAAAATTTATATGTATGAAGTATAC 

TCA-ATTTAGGACGACTCGGGCAATGAAATCATCATTTAATAGGAGCAATG
AAATCATTTTCGAAAAATGTTTACAAATGAATAAAATATTAAATTAAACT 
TAA,\ACATTTTGTTAGTAGTTTGAAATTTACAAACTGAAATTTGTTGTAT 
TTATTAACATTTATAAATGTTGTACTATGATTTTTTCCTTGTTTGCAAAT 
ATTCCTTAAAAATCCACCTAAAATCAAAATAATTAATCTTTTTCAAGTTG 

AAAAATGAAAATCGTATGATATAACCGTGTATGGATGTGGAATTATATAT
CAGTTACTAATTACATTTTTTGTTGGGATATATGTGCGCAGATTGATATT 
GCAATCCCATTCACTCTCACACACTCTTTCCAAAACCTCCGTAAACTTGC 
TTTGGAAAAGTATGAAGGAGTGGAGGTGGTGTTTGAGATAGAGAGTCCAA 
CAAGTAGAGAATTGATAACAATTCACCATAATCAACAACCACTACTTCCC 

AACCTTGAGTTATTGGATATAAGTTTTATGGACAGCATGAGTCATGTATG
GAAGTGCAACTGGAATAAATTCTTCATTCTTCAAAAACAACAGTCAGAAT 
CCCCATTCTGTAATCTCACAACCATACATATTCAATATTGCCAAAGCATT 
AAGTACTTGTTTTCAACTCTCATGGCAAAACTTCTTTCCAACCTAAAGAA 
GGTCGAGGTAAGAGAGTGTCATGGTATTGAAGAAGTTGTTTCGAACAGAG

ATGATGAAGATGAGGAAAAGACTACATTTACATCTACATCTTCTGAAAAA 

AGCACTAATTTGTTCCCTCGTCTTGAATCTCTCGCTCTTTATCAACTTCC 

AAATCTCAAGTGTATTGGTGGTGGTGGTTCTGCCAACAGTGGGAACAATG 

AAATATCTCTTGATAATTCCACTACTACTACTTCTTTTGTTGATCAATCT 

AAGGTATGTTTTTTTTTTTNGTTNCCCTT (SEQ ID NO:109) 

Sequence gap 

CCTCCCTAATAATACATGTTATGCACACTATACTAACATATTAGACACGT 

AAAGGATAAATGCTATGCCTCATATAATACGTTATATTTATAATCTTTAA 

ACAATCAAATTTATTAAACAAATAACTAAGTGTGAGCAAAGGCAGGTACC 

CGACTAAATTGCCCAAAACCAGTCTGGTGGTTCGTGGAATGTTGGGCCAG 

GTCGTTAAAACGTCTACACACCGGTTCTTTAAATCACAGATCCGCTTCTC 

ATACTGTGAACCCGGTTTTAATTTTAAAAGAAAATTTCATTATAAAGTAA 

ATGACTTAAACCATTACAAACAACAAAAATTTACCATTACAATGTTGGAC 

TATCATTATTTGCAACATAAAACTGAAAATACACATATTTCCTTCTGATA 

TCAGCATGAGTGGCTGGTTGGCTAACCCAAAAATCCATGCATTGTAGATG 

TGTGTTACAACACATAGTATCAATGAAAGGCATATTTTTAGGCTAGAATT 

TAACAATCTGTAATAATATTCCCTAAAACTAATATCATCATCAACCAACT 

AATATAAAACCATTGGGTTCGTCATTTTAGGTACAAAACATAGATTTTTC 

TAAGCTTGTTGTATTTAAACATATGCTTTCTAAACTTAATTGATTTTGCA 

TTCCAAAATTTTAGGTTGTAAAGTGGTATGTCATTTGTTGTCTTTTCAAC 

ATT.AATTGTACAAAAACCAAAACTACATAATTGATGTAGATATCATAACA 

ATTGTGTTATTTAGTATATAAAAACTAAATTTTGAATTGAATTTCTTATA 

CAA-AAGTTGTGTCTATGTATACATGTTTATGTAGGTAATAGACAATTAGT 

CTCTGTTAAGTATATGGAGTTTAATTTTTAGACTAATTTTTCATGTGTTG 

CAGTTTTATCAGGCAGGTGGCGTTTTTTGGACGTTATGCCAATACTCCAG 

AGAGATAAATATAAGGGAGTGTTATGCATTGTCAAGTGTAATTCCATGTT 

ATGCAGCAGGACAGATGCAAAATGTTCAAGTGCTGAATATATACAGGTGC 

AACTCAATGAAGGAGTTATTTGAAACTCAAGGGATGAACAACAACAATGG 

TGACAGTGGTTGTGATGAAGGAAATGGTTGTATACCAGCAATTCCAAGAC 

TAA.ATAACGTTATTATGCTACCCAATCTAAAGATATTGAAGATTGAAGAT 

TGTGGTCATCTGGAACATGTATTCACATTCTCTGCACTTGGAAGCCTGAG 

ACAGCTCGAAGAGTTAACGATAGAGAAATGCAAGGCAATGAAAGTGATAG 

TGAAGGAAGAAGATGAATATGGAGAGCAAACAACAAAGGCATCTTCGAAG 

GAGGTTGTGGTCrTTCCTCGTCTCAAGTCCATTGAACTGGAAAATCTACA 

AGAGCTCATGGGTTTCTACTTAGGGAAGAATGAGATTCAGTGGCCTTCAT 

TGGATAAGGTTATGATCAAGAATTGCCCAGAAATGATGGTGTTTGCACCT 

GGTGAGTCCACAGTTCCCAAGCGCAAGTATATAAATACAAGCTTTGGCAT 

ATATGGGATGGAGGAGGTACTTGAAACTCAAGGGATGAACAACAATAATG 

ATGACAATTGTTGTGATGATGGAAATGGTGGAATTCCAAGACTAAATAAC 

GTTATTATGTTTCCAAATATAAAGATATTGCAAATCAGCAATTGTGGCAG 

TTTGGAACATATATTCACATTCTCTGCACTTGAAAGCCTGATGCAGCTCA 

AAGAGTTAACAATAGCGGATTGCAAGGCAATGAAAGTGATTGTGAAGGAG 
GAATATGATGTAGAGCAAACAAGGGTATTGAAGGCTGTGGTATTTTCTTG
TCTAAAGTCCATTACACTATGCCATCTACCAGAGTTGGTGGGTTTCTTCT 
TGGGGAAGAATGAGTTCTGGTGGCCTTCATTGGATAAGGTTACCATCATT 
GATTGCCCACAAATGATGGGGTTCACACCTGGTGGGTCAACAACTTCCCA 
CCTCAAGTACATACACTCAAGCTTAGGCAAACATACTCTTGAATGTGGCC
TTAATTTCAAGTCACAACTACTGCATATCATCAGGTATAATTATTATTCT 
TTNACACCATCTAATTATGGAATCATGACGCTAATTACAGTATTAAACAC 
(SEQ ID NO:110) 

RG2K deduced polypeptide sequence (SEQ ID NO:111)

MECITGIFSNPFAQCUAPVKEHLCIJLIFYTQYVGDMLTAMTELNAAKDIVEERK 
NQNVEKCFEVPNHVNRWLEDVQTINKKVERVLNDNCNWFNLCNRYMLAVKAL 
EITQEIDHAMKQLSRIEWTDDSVPLGRNDSTKASTSTPSSDYNDFESREHTFRKAL 
EALGSNHTSHMVALWGMGGVGKTTMMKRLKNIIKEKRTFHYIVLWIKENMDL 

ISIQDAVADYLDMKLTESNESERADKLREGFQAKSDGGKNRFUILDDVWQSVN
MEDIGLSPFPNQGVDFKVLLTSENKDVCAKMGVEANLIFDVKFLTEEEAQSLFY 
QFVKVSDTHLDKIGKAIVRNCGGLPIAIKTIANTLKNRNKDVWKDALSRIEHHD 
lETLUIVVFQMSYDNLQNEEAQSIFLLCGLFPEDFDIPTEELVRYGWGLRVFNGV 
YTIGEARHRLNAYffilXKDSNUJESDDVHCIKMHDLVRAFVLDTFNRFKHSUV 

NHGNGGMLGWPENDMSASSCKRISUCKGMSDFPRDVKFPNLULKLMHADKS
LKFPQDFYGEMKKLQVISYDHMKYPLLPTSPQCSTNLRVLHLHQCSLMFDCSSI 
GNLLNLEVI^FANSGIEWLPSTIGNLKELRVLDLTNCDGLRIDNGVLKKLVKLEELY 
MR\GGRYQKAISFTDENCNEMAERSKNLSALEFEFFKNNAQPKNMSFENLERFKIS 
VGCYFKGDFGKIFHSFENTLRLVTNRTEVLESRLNELFEKTDVLYLSVGDMNDLED 

VEVKLAHLPKSSSFHNLRVLnSECIELRYLFTLDVANTLSKLEHLQVYECDNMEEn
HTEGRGEVTITFPKLKFLSLCGLPNLLGLCGNVHIINLPQLTELKLNGIPGFTSIYPEK 
DVETSSLLNKEVVIPNLEKLDISYMKDLKEIWPCELGMSQEVDVSTLRVIKVSSCDN 
LVNLFPCNPMPLIHHLEELQVIFCGSIEVLFNIELDSIGQIGEGINNSSLRnQLQNLGK 
IJSEWRKGADNSSLLISGFQGVESIIVNKCmFRNVFTPTTTNFDLGALMEIRIQDC 

GEKRRNNELVESSQEQEQ

RG2L polynucleotide sequence (SEQ ID NO:112) 

GGAAGACACAATGATGCAAAGACTGAAGAAGGTTGCCAAAGAAAATAGAA
TGTTCAGTTACATGGTCGAGGCAGTTATAGGGGAAAAGACAGACCCAATT 

GCTATTCAACAAGCTGTAGCCGATTACCTTCGTATACAGTTCAAAGAAAG
CACTAAACCAGCAAGAGCTGATAAGCTTCGTGAATGGTTCAAGGCCCACT 
CTGNAGACGGTAAGAATAAGTTCCTCGTAATATTTGATGACGTCTGGCAG 
TCCGTTGATCTGGAAGATATTGGNTTAAGTCCTTTTCCAAATCAAGGTGT 
CGACTTCAAGGTCTTGTTGACTTCACGAGACGAACACGTTTGCACAATGA 

TGGGGGTTGAAGCTAATTCAGTTATTAATGTGGGACTTCTAACTGAAGTA
GAAGCACAAAGTCTGTTCCAGCAATTTGTAGAAACTTTTGAGCCCGAGCT 
CTGTAAGATAGGAGAAGTTATCGTAAGAAAGTGTTGCGGTCTACCTATTG 
CCATCAAAACCATGGCGTGTACTCTAAGAAATAAAAGAAAGGATGCATGG
AAGGATGCACTTTCACGTATAGAGCACTATGACATTCGTAGTGTTGCGCC 
TAAAGTCTTTGAAACAAGCTATCACAATCTCCAAGACAGGGAGACTAAAT 
CCGTGTTTTTGATGTGTGGTTTGTTTCCTGAAGACTTCAATATTCCTACC 

GAGGAGTTGATGAGGTATGGATGGGGCTTAAAGCTATTTGACAGAGTTTA
TACAATTAGAGAAGCAAGAACCAGGCTCAACACCTGCATTGAGCGACTTG 
TGCAGACAAATTTGTTAATTGAAAGTGATGATGTTGGGTGTGTCAAGATG 
CATGATCTGGTGCGTGCTTTTGTTTTGGGTATGTATTCTGAAGTCGAGCA 
TGCTTCAATTGTCAACCATGGTAATATGCATGGGTGGACTAAAAATGATA 

TGAACGACTCTTGCAAAACAGTTTCTTTAACATGCGAGAGTGTGTCTGAG
TTTCCAGGAGACCTCAAGTTTCCAAACCTAAAGCTTTTGAAACTTATGCA 
TGGAGATAAGATGCTAAGGTTTTCTCAAGACTTTTATGAAGGAATGGAAA 
AGCTCCAGGTAATATCATACCATAAAATGAAGTATCCATTGCTTCCCTCG 
TCACCTCAATGCTCCACCAACCTTCGAGTGCTTCATCTTCATCGGTGTTC 

ATTACGGATGCTTGATTGCTCTTGTATCGGAAATTTGACGAATCTGGAAG
TGTTGAGCTTCGCTAATTCTGGCATTGAACGGATACCTTCAGCAATCGGA 
AATTTGAAGAAGCTTAGGCAACTTGATCTGAGAGGTCGTTATGGTCTTTG 
TATAGAACAGGGTGTCTTGAAAAATTTGGTCGAACTTGAAGAACTTTATA 
TTGGAAATGCATCTGCGTTTAGAGATTATAACTGCAATGAGATGGCAG 


RG2L deduced polypeptide sequence (SEQ ID NO: 113) 

EDTMMQRLKKVAKENRMFSYMVEAVIGEKTDPIAIQQAVADYLRIQFKESTKPAR 

ADKLREWFKAHS7DGKNKFLVIFDDVWQSVDLEDIGLSPFPNQGVDFKVLLTSRDE 

HVCTMMGVEANSVINVGLLTEVEAQSLFQQFVETFEPELCKIGEVIVRKCCGLPIAI 

KTMACTLRNKRKDAWKDALSRIEHYDIRSVAPKVFETSYHNLQDRETKSVFLMCG
LFPEDFNIPTEELMRYGWGLKLFDRVYTIREARTRLNTCIERLVQTNLLIESDDVGC 
VKNfflDLVRAFVLGMYSEVEHASIVNHGNMHGWTKNDMNDSCKTVSLTCESVSEF 
PGDLKFPNLKLLKLMHGDKMLRFSQDFYEGMEKLQVISYHKMKYPLLPSSPQCST 
NLRVLHLHRCSLRMLDCSCIGNLTNLEVLSFANSGIERIPSAIGNLKKLRQLDLRGR 

YGLCIEQGVLKNLVELEELYIGNASAFRDYNCNEMA

RG2M polynucleotide sequence (SEQ ID NO:114)

GGGGAAGACACAATAGATGCAAAGGCTGAAGAAGTTGCCAAAGAAAAGAG 
AATGTTCAGTTATATCATTGAGGCGGTTATAGGGGAAAAGACAGACCCCA 

TTTCCATTCAGGAAGCTATATCATATTACCTTGGTGTAGAGCTCAATGCA
AATACTAAGTCAGTAAGAGCTGATATGCTTCGTCAAGGGTTCAAGGCCAA 
ATCTGATGTAGGTAAGGATAAATTCTTAATAATACTCGACGATGTATGGC 
AGTCTGTTGATTTGGAAGATATTGGATTAAGTCCATTTCCAAATCAAGGT 
GTTAACTTCAAGGTCCTGTTAACATCACGAGACCGACATATTTGCACTGT

GATGGGGGTTGAAGGTCATTCGATTTTTAATGTGGGACTTCTCACAGAAG
CAGAATCAAAAAGATTGTTCTGGCAGTTTGTAGAAGGTTCTGATCCTGAG
CTCCATAAGATAGGAGAAGATATTGTAAGTAAGTGTTGTGGTCTACCCAT 






TGCCATTAAAACCATGGCATGTACACTTAGAGATAAAAGTACGGATGCAT 
GGAAGGATGCACTGTCTCGTTTAGAGCATCATGACATTGAAAATGTTGCC 
TCTAAAGTTTTTAGAGCGAGCTATGACCATCTCCAAGACGAGGAGACTAA 
ATCCACTTTTTTTCTATGTGGATTGTTTCCAGAAGATTCCAATATTCCTA 

TGGAGGAGTTGGTGAGGTATGGGTGGGGATTGAAATTATTTAAAAAAGTG
TATACCATAAGAGAAGCAAGAACTAGGCTCAACACTTGCATTGAGCGGCT 
CATCTATACCAATTTGTTGATAAAAGTTGATGATGTTCAGTGCATCAAGA 
TGCATGATCTCATCCGTTCTTTTGTTTTGGATATGTTTTCTAAAGTTGAG 
CATGCTTCGATTGTCAACCATGGTAATACGCTAGAGTGGCCTGCAGATNA 

TNTGCACGACTCTTGTAAAGGGCTTTCATTAACATGCAAGGGTANATGTG
AGTTTTGTGGAGACCTNAANTTTCCAACCCTAATGATTTTAAAACTTATG 
CATGGAGATAAATCGCTAAGGTTT 

RG2M deduced polypeptide sequence (SEQ ID NO:115) 

GEDTIDAKAEEVAKEKRMFSYnEAVIGEKTDPISIQEAISYYLGVELNANTKSVRAD
MLRQGFKAKSDVGKDKFLnLDDVWQSVDLEDIGLSPFPNQGVNFKVLLTSRDRHI 
CTVMGVEGHSIFNVGLLTEAESKRLFWQFVEGSDPELHKIGEDIVSKCCGLPIAIKT 
MACTLRDKSTDAWKDALSRLEHHDIENVASKVFRASYDHLQDEETKSTFFLCGLFP 
EDSNIPMEELVRYGWGUCLFKKVYTIREARTRLNTCIERUYTNLLIKVDDVQCIKM 

HDLIRSFVLDMFSKVEHASIVNHGNTLEWPAD??HDSCKGLSLTCKG?CEFCGDL?F
PTLMILKLMHGDKSLRF 

RG2N polynucleotide sequence (SEQ ID NO:116)

AGGTAAAATCCATAACCCTAAATGTTGGTACGCTCATATATCAAATTGCG 

TGTTTTGTTGAATGAAAAAAGCATGCTCAAAAAACCAGTGTAAGGCACGG
TATATGACATATTTATAGTTACTGATAACAAATTATGATAATTTTGGGTT 
TACRTAAGTTAGGATTCGTACTTCAACCAAATGTAATAGTTTTTGTGAGT 
CTATCTATGTATTTGGGGAATCACATTAGCAACGGGATTGTACTAGTAAT 
TCG.AAAAAGTCmTTAAATAATTTTTCTGTTTATAATTTATGAATAGTTT 

TAGCGACATCTAATATTAAATAGAATGTATCTGATATTGAATTAATGTCC
TTAATGTGAACATAGACCTTTTCCATTTACTAATGCCTAATTATTAGTTT 
CTAATCAATAAATTTTAATTTCTGTTTTATGCTTCTAAGACAATAAAAAT
CCATGATTTACCTTTAAATATTAACAAAAATGACCATAAATAAATAAAAA 
ATTAGGATACCAAACCCCCCCGCCATGCCCAATGTCTAAATATTCTTGAT 

GCTTrTGCTTTTCCCTCTTTTCCTTGTTAGTCTATTATTCTGGAGAGTTT

GAGAGAGTTTCATACAAGAAAATTTCAAGAAGAAAGCAAAGGTCCAGGTA 
TTCTCTTTTCTTAATTATGTATTAACTTACAAGCATTTTTTACACGATCC 
ATGGTTTTTTGTGTATGTTTTTCAAATTGAAACTAGATTGGGACTTTTGC 
CCTTGATGATTCATAAGATATTGCATGGAGTTGAGATTGTGTAAGAAAAG 

TGGTGAATAGAAAGAGCAAGTGAATCCAGATATAGTATTGGTAATATATG
ATGATGAGATAGAGATATGTTAAAACTGGCTAGAAAATTGTTTTAATTTG 
AAATTTAGGTKGTTGAATTTGAAAGATACCAAGCTAATAACTAATTAGTT 
ATGCTAAWTAGTTATAAAGAACAACAAACTCTTAGTTTTTTTTTTCATGA
TTTTCAACCTCTTTGTACCAAACTAAATTATAGCAAAATTGAATATCATT 
CTCTGCAATCAATCTTAACTTTTGTTATTATCATCATGTCTAAAATTGCC 
ACAAGTTTATTTTCAAAGTCATATTGGATTATGAAAGGACTATTTTTACC 

AATTACATCTTTACTTTATGGGCCAAAGCTAATACAATCCGACTAAACTA
AAGGAATATGGGATGCATATAGTTTGCTTCCCGATTATAGATTTCTATCT 
AATTTGTCTATTGTACTAATTTAGGTGCCACCACAAGTAAATTTGTTAAA 
TGGATATCGTTAATGCCATTCTTAAACCAGTTGTCGAGACTCTCATGGTA 
CCCGTTAAGAAACACATAGGGTACCTCATTTCCTGCAGGCAATATATGAG 

GGAAATGGGTATCAAAATGAGGGGATTGAATGCTACTAGACTTGGTGTCG
AAGAGCATGTGAACCGGAACATAAGCAACCAGCTTGAGGTTCCAGCCCAA 
GGCAGGGGTTGGTATGAAGAAGTAGGAAAGATCAATGCAAAAGTGGAAAA 
TTTTCCTAGCGATGTTGGCAGTTGTTTCAATCTTAAGGTTAGACACGGGG 
TCGGAAAGAGAGCCTCCAAGATAATTGAGGACATCGACAGTGTCATGAGA 

GAACACTCTATCATCATCTGGAATGATCATTCCATTCTTCTAGGAAGAAT
TGATTCCACGAAAGCATCCACCTCAATACCATCAACCGATCATCATGATG 
AGTTCCAGTCAAGAGAGCAAACTTTCACAGAAGCACTAAACGCACTCGAT 
CCTAACCACAAATCCCACATGATAGCCTTATGGGGAATGGGCGGAGTGGG 
GAAGACGACAATGATGCATCGGCTGAAAAAGGTTGTGAAAGAAAAGAAAA 

TGTTTAATTTTATTGTTGAGGCGGTTGTAGGGGAAAAAACAGACCCCATT
GCTATTCAATCAGCTGTGGCAGATTACCTAGGTATAGAGCTCAATGAAAA 
AACTAAACCAGCAAGAACTGAGAAGCTTCGTAAATGGTTTGTGGACAATT 
CTGCTGGTAAGAAGATCCTAGTCATACTCGACGATGTATGGCAGTTTGTA 
GATCTGAATGATATTGGTTTAAGTCCTTTACCAAATCAAGGTGTCGACTT 

CAAGGTGTTGTTGACATCACGAGACAAAGATGTTTGCACTGAGATGGGAG
CTGAAGTTAATTCAACTTTTAATGTGAAAATGTTAATAGAAACAGAAGCA
CAAAGTTTATTCCACCAATTTGTAGAAATTTCGGATGATGTTGATCGTGA
GCTCCATAATATAGGAGTGAATATTGTAAGGAAGTGTGGCGGTCTACCCA 
TTGTCATCAAAACCATGGCGTGTACTCTTAGAGGAAAAAGCAAGGATGCA 

TGGAAGAATGCACTTCTTCGTTTAGTGAACTACAACATTGAAAATATAGT
GAATGGAGTTTTTAAAATGAGTTACGACAATCTCCAAGATGAGGAGACTA 
AATCCACCTTTTTGCTTTGTGGAATGTTTCCCGAAGACTTTAATATTCCT 
ACCGAGGAGTTGGTGAGGTATGGATGGGGGTTGAAATTATTTAAAAAAGT 
GTATACTATAGGAGAAGCAAGAATCAGGCTCAACACATGCATTGAGCGGC 

TCATTCATACAAATTTGTTGATTGAAGTTGATGATGTTAGGTGCATCAAG
ATGCATGATCTTGTCCGTGCTTTTGTTTTGGATATGTATTCTAAAGTCGA 
GCATGCTTCCATTGTCAACCATGGTAATACACTAGAGTGGCATGTGGATA 
ATATGCACAACTCTTGTAAAAGACTTTCATTAACATGCAAGGGTATGTCT 
AAGTTTCCTACAGACCTCAAGTTTCCAAACCTCTCGATTTTGAAACTTAT 

GCATGAAGATATATCATTGAGGTTTCCCAAAAACTTTTATGAAGAAATGG
AGAAGCTTGAGGTTATATCCTATGATAAAATGAAATATCCATTGCTTCCC 
TCATCACCGCAATGCTCCGTCAACCTTTGCGTGTTTCATCTCCATAAATG 
CTCGTTAGTGATGTTTGACTGCTCTTGTATTGGAAATCTGTCGAATCTAG 
AAGTGCTTAGCTTTGCTGATTCTGCCATTGACCTGTTGCCTTCCACAATC

GGAATTTTGAAGAAGCTAAGGCTACTGGATTTGACAAATTGTTATGGTCT 

TTGTATAGCTAATGGTGTCTTTAAAAAATTGGTCAAACTTGAAGAGCTCT 

ATATGACAGTGGTTAATGGAGGAGTTCGAAAGGCGATCAGCCTCACTGAG 

GATAACTGCAATGAGATGGCAGAACGTTCAAAAGACCTTTCTGCATTAGA 

ACTTGAGTTCTTTGAAAACAATGCTCAGCCAAAGAATATGTCATTTGAGA 

AGCTACAACGATTCCAGATCTCAGTGGGGTGCTATTTATATGGAGCTTCC 

ATAAAGAGCAGGCACTCGTATGAAAACACATTGAAGTTGGTTATTGACAA 

AGGTGAATTATTTGAATCTTGAATGAACGGCCTGTTTAAGAAAACAGAGG 

TGTTATGTTTAAGTGTGGGAGATATGAATGATCTTGAAGATRTTGAGGTT 

AAGTCATCCTCACAACYTCTTCAATCTTCTTCGTTCAACAATTTAAGAGT 

CCTTGTCGTTTCAAAGTGTGCAGAGTTGAAACAC1TCTTCACACCTGGTG 

TTGCAAACACTTTAAAAAAGCTTGAGCATCTTGAAGTTTACAAATGTGAT 

AATATGGAAGAACTCATACGTAGCAGGGGTAGTGAAGAAGAGACGATTAC 

ATTCCCCAAGCTGAAGTTTTTATCTTTGTGTGGGCTACCAAAGCTATCGG 

GTTTGTGCGATAATGTCAAAATAATTGAGCTACCACAACTCATGGAGTTG 

GAACTTGACGACATTCCAGGTTTCACAAGCATATATCCCATGAAAAAGTT 

TGAAACATTTAGTTTGTTGAAGGAAGAGGTAAATATAAATTTTTAATGCT

AATACATTACAAAGGATCTTTTCAGTTAAATCTTTCAAAATATATTGTAA 

TTTGATTGTATGGGGTATTATTGTTGGATGGGACTATTAATAAATGATTA 

TCTTGCAGGTTCTGATTCCTAAGTTAGAGAAACTGCATGTTAGTAGTATG 

TGGAATCTGAAGGAGATATGGCCTTGCGAATTTAATATGAGTGAGGAAGT 

TAAGTTCAGAGAGATTAAAGTGAGTAACTGTGATAAGCTTGTGAATTTGT 

TTCCGCACAAGCCCATATCTCTGCTGCGTCATCTTGAAGAGCTTAAAGTC 

AAGAATTGTGGTTCCATTGAATCGTTATTCAACATCCATTTGGATTGTGC 

TGGTGCAACTGGAGATGAATACAACAACAGTGGTGTAAGAATTATTAAAG 

TGATCAGTTGTGATAAGCTTGTGAATCTCTTTCCACACAATCCCATGTCT 

ATACTGCATCATCTTGAAGAGCTTGAAGTCGAGAATTGTGGTTCCATTGA 

ATCGTTATTCAACATTGACTTGGATTGTGCTGGTGCAATTGGGCAAGAAG 

ACAACAGAAGCAGCTTAAGAAACATCAAAGTGGAGAATTTAGGGAAGCTA 

AGAGAGGTGTGGAGGATAAAAGGTGGAGATAACTCTCGTCCCCTTGTTCA 

TGGCTTTCAATCTGTTGAAAGCATAAGGGTTACAAAATGTAAGAGGTTTA 

GAAATGTATTCACACCTACCACCACAAATTTTAATCTGGGGGCACTTTTG 

GAGATTTCAATAGATGACTGCGGAGAAAACAGGGAAAATGACGAATCGGA 

AGAGAGTAGCCATGAGCAAGAGCAGGTAAGGATTTCAATTTCACTTTCKT 

ACrTAATTAATGATTAAGCTCCTGCTTTTTRAATAAAAAAGGGACAAACC 

ATTTCATGACTTAATGTAGCAATACAAGTCATGTATAAGAGTGACCAACT 

CTTTTTTATTTATAAAATGACTACAAAATATTTTTTTTCATTAGAGATCA 

TGTATAAATGTGACTAATTTTTCATCACCTAACTTTAGTTGATAAATCTT 

TATAAATGTCACTAGTTACTTTTCAGTAAAATAACAAATTTAATAAATTA 

TCAACAAAAAGCATCAACTAAAAAAATCCCACAACCCGTAATAATTTAAA 

ATAAAAGGATTTAACATCTAATACGAACAATTTTTTTTCTAAACATGATT 

TGGACCAAATATCACCAGCAACTCAAGTTTGGAATCGATTCAGCTTAAAA 
CTTGACCARCATAATTAGATAGATGAGAGTTGAAGCTAAAGTGCCTATAT

AAGTTCGTTTCATCTTTTTTCTTGATCTTGATAGCAAGTTGAATSATTTT 

CTTCTTCAAAATTGATAAAAATCTACATTATAAAGAGACTAGCTTGAAAA 

AAAATGGTCTAGGTGGGTCTTGGGTCTGGTAGATGAAGATGGAAGGGAGA 

GTAGATTTCAAAGACACAAACACATCTTCATTTTATTTATTTATTTATTA 

TTATTATTTTTTGATATCTTGCTCATATTTGTTACAGATATGTGAGGTCT 

ATTAATCTTTTTAAATATATAAAAAATAAATACATAAATGAGAAAATTAA 

ATAAAGAATAAATTAATAAGGGCACAATAGTCTTTTTTGGTAAGACAAGG 

ACCAAAAGCGCAACAAAAGTAAACAGTAGGGACCATCCGATTTAAAAAAT 

TAATTAGGGACCAAAAACATAAATTCCCCCAAACCATAGGGACCATTCGT 

GTAATTTACTCTTGCTTTTCGTTTTGTTCATATTTGGGTAACTATTTTTT 

TTGTACATATCTAGGTAACGAACTTGTTGAAAGTGTTCACATCTACGATG 

TGACCTACTACAACCGATCATAATGGTCATATATGAACACTTCCAACAAG 

TTTGTTATCTAGGTGTGTACAAAAAAACGATAGTTACCATGATGTGAACA 

TACCAAAAAATTAATrACCTTAGCAAGTTATTTTCCCATTTAGGTTGTAT 

GGAAACAGTTCCGTGAGACCGTGACTTGGATGGTAGATAAATTTAGTAAA 

CTTAACCCTTCAATTAACCTACCTTTTTCTTATTAACTCAATTTCAAGCT 

AAATTCTGATTCTTGTTTGAAAGTAAGTTGCATCTTTATGTTTGTATTAT 

CTTGTTGCATAGGATCCTTAGCATCTTTTAATAATTTATTTGAAGGTGAA 

AGATCCAACTATTTTTAATCTGTTGGCATTTTCCATCATTTGCAACTGTT 

TCTTGAAAAAAA::TACCTAAAATCAAAATAACCATTTTCATATCCAAAA 

TTATAAGAGAGAATTGTTAACGGACATGGAATCATAAATCATTAACACAG 

TTCAGTACACAGGTTGCTAATTACATTTCTTGCTGTGCAGATTGAAATTC 

TATCAGAGAAAGAGACATTACAAGAAGCCACTGGCAGTATTTCAAATATT 

GTATTCCCATCCTGTCTCATGCACTCTTTTCATAACCTCCATAAACTTAA 

CTTGAACAGAGTTGAAGGAGTGGAGGTGGTGTTTGAGATAGAGAGTGAGA 

GTCCAACAAGTAGAGAATTGGTAACAACTCACCATAACCAACAACAACCT 

ATTATACTTCCCAACCTCCAGGAATTGATTCTATGGAATATGGACAACAT 

GAGTCATGTGTGGAAGTGCGGCAACTGGAATAAATTCTTCACTCTTCCAA 

AAGAACAATCAGAATCCCCATTCCACAACCTCAGTAACATACATATTTAT 

GAATGCAAAAGCATTAAGTACTTGTTTTCACCTCTCATGGCAGAACTTCT 

TTCCAACCTAAAGCATATCGAGATAAGAGAGTGTGATGGTATTGAAGAAG 

TTGTTTCAAAAAGAGATGGTGAGGATGAAGACATGACTACATCTAC:::: 

:::GCACACAACCACCACTTTTTCCCTCATCTTGATTCTCTCACTCTAAA 

GCAACTGAAGAATCTGAAGTGTATTGGTGGAGGTGGTGCCAAGGATGAGG 

GGAGCAATGAAATATCTTTCAATAATACCACTGCAACTACTGCTGTTCTT 

GATCAATTTGAGGTATGCTTTGTACATATTCAATTATTTATTTAATTTCC 

TTGTTAATTTCCTTTTTT(mTGCAATATTCTATGAAAAAAATCACCAAA 

TCACAAATAAGAGATTTAAACTTTTATTTCACACCCATGCGGACTCAAGA 

ATGGGATTTGGAGGCATATAAAGTTACATTCATTTGAACAAGTATTACCA 

TTTATTTGTTATTTATCATTTTCATATCATTTACTGATAACATTTCTTTT 

TTACTTTTCTAATTAGAAAAGGTCCACATGTCTAATTAGGTTTTCCATTC 

TATGTGAATCCrCTATTCrGTCTGTAATCAAGCATCTTAGATTATTTATC 
AATTGCAACCTGTCATATWTTMWWKKCWWWATKYWMWWARTAATACATTT
TATACCCWCTATACTAAGATA 

RG2N deduced polypeptide sequence (SEQ ID NO:117)

LGKTTMMHRIJCKVVKEKKMFNFIVEAVVGEKTDPIAIQSAVADYLGIELNEKTKPA 
RTEKLRKWFVDNSAGKKILVILDDVWQFVDLNDIGLSPLPNQGVDFKVLLTSRDKD 
VCTEMGAEVNSTFNVKMLIETEAQSLFHQFVEISDDVDRELHNIGVNIVRKCGGLPI 
XOKTMACTUIGKSKDAWKNALLRLVNYNIENIVNGVFKMSYDNLQDEETKSTFLL 
CGMFPEDFhnPTEELWYGWGUOJ'KKVYTIGEARIRLNTCffiRLIHTNLUEVDDVR
CKAIHDLVRAFVUJMYSKVEHASrVWHGmi^WHVDNMHNSaail^LTCKGM 
FPTDLKFPNLSILKLMHEDISLRFPKNFYEEMEKLEVISYDKMKYPLLPSSPQCSVNL 
CVFHLHKCSLVMFDCSCIGNLSNLEVLSFADSAIDLLPSTIGILKKLRLLDLTNCYGL 
CIANGVFKKLVKLEELYMTVVNGGVRKAISL 


RG2O polynucleotide sequence (SEQ ID NO:118)

TTGTAAAACGACGGCCAGTCGAATCGTAACCGTTCGTACGAGAATCGCTG 
TCCTCTCCTTCATTTGAATCATGATATTTGAATATCGATACrTTTGACTG 
TAGCTTTTGGGTCGATTTTTTAGCAAGATACATAACTGGCCAAACCCATT 
GGCTATTTTAGCCCAAAATATGAAATGGACTGGATTGTTTTTTTCCTTTC
TAACACGCACACATCTGGCGATCAGTATCACTCCATTATGAAGACCTAGT 
CAAATTCATTAACGTTCAGTCGTTCCTTCAAAGTTTCAAAGTTCCAACTT 

TCTGCGACGAAGGAGAGCTTGGTCAGAGGGCTGTGATTCTTGAGTCTTGA 

CCTCCGAATCTAGCTGGATTATTTTCGACACACCAGACCACGTATCAGGT
TGCTCATCCCGAAATACTGCTTTGCAAACTGTTGTATCATCGCCTAGGAA 
ATTAAGTTTCTTTTTTGGCTCTGTTACTGAATCAGTAGCTTTGCAACTTG
CTCATTATAAGCTGATCCATATTTTACATATCTTTTGAAGAATAATAGGT 
ACTGACTTTACCTTTCTGATGAGAGCGATTTAAGAGATACCTCTGTAAAA 

TCCATTTTTGTGAAGGGATCTGGGTTAGTTTTTAAAGGATTTGCTACAAC
AGTATCCCACAAACGATCTATTTCCCATTTNACTCATCCGCTCAAGATCT 
ATCCACCTTTATATATGTTAATTGGGAGTCTTCCATGGTGCAATGAATCT 
AGGATGCATTTAGAAGCCCAATCCATTACAAGTTTTCATCCAATTTCATG 
TGACAAGTTGTTGGTTACTATGTAGGTACTTCCACAATTAAGAATTTCCA 

GCAATGGATGTTGTTAATGCCATTCTTAAACCAGTTGCCGAGACACTTAT
GGAACCTGTTAAGAAACATCTAGGCTACATCATTTCCAGCACAAAACATG 
TGAGGGATATGAGTAACAAAATGAGGGAGTTGAACGCTGCAAGACATGCT 
GAAGAAGACCACTTGGACAGGAACATAAGAACTCGTCTTGAGATTTCAAA 
TCAAGTTAGGAGTTGGTTAGAAGAAGTAGAAAAGATCGATGCAAAAGTAA

AAGCCCTTCCTAGTGATGTCACCGCTTGTTGCAGTCTCAAGATCAAACAT
GAAGTCGGAAGGGAAGCCTTGAAGCTAATTGTGGAGATTGAAAGTGCCAC 
AAGACAACACTCTTTGATCACCTGGACTGATCATCCCATTCCTCTGGGAA 
AAGTTGATTCCATGAAGGCATCGATGTCCACAGCATCAACCGATTACAAT
GACTTTCAGTCAAGAGAAAAAACTTTTACTCAAGCATTGAAAGCACTTGA 
ACCAAACAACGCTTCCCACATGATAGCGTTATGTGGGATGGGTGGAGTGG 
GGAAGACCACAATGATGCAAAGACTAAAAAAAGTTGCTAAACAAAATAGA 

ATGTTCAGTTATATGGTTGAGGCAGTTATAGGGGAAAAGACGGACCCAAT
TGCTATTCAACAAGCTGTAGCGGATTACCTTCGTATAGAGTTAAAAGAAA 
GCACTAAACCAGCAAGAGCTGATAAGCTTCGTGAATGGTTCAAGGCCAAC 
TCTGGAGAAGGTAAGAATAAATTCCTTGTAATACTTGATGACGTCTGGCA 
GTCTGTTGATCTAGAAGATATTGGTTTAAGTCCTTTTCCAAATCAAGGTG 

TCGACTTCAAGGTCTTATTGACTTCACGAGACGAACATGTTTGCACAGTA
ATGGGAGTTGGATCTAATTCAATTCTTAATGTGGGACTTCTAATAGAAGC 
AGAAGCACAAAGTTTGTTCCAACAATTTGTAGAAACTTCTGAGCCCGAGC 
TCCATAAGATAGGAGAAGATATTGTAAGGAAGTGTTGCGGTCTACCTATT 
GCCATCAAAACCATGGCATGTACTCTTAGAAATAAAAGAAAGGATGCTTG 

GAAGGATGCACTTTCGCGTATAGAGCACTATGACCTTCGCAATGTTGCGC
CTAAAGTCTTTGAAACGAGCTACCACAATCTCCATGACAAAGAGACTAAA
TCAGTGTTTTTGATGTGTGGTTTGTTTCCGGAAGACTTCAATATTCCTAC 
TGAGGAGTTGATGAGGTATGGATGGGGATTAAAGATATTTGATAGAGTCT 
ATACATTTATAGAAGCAAGAAACAGGATCAACACCTGCATTGAGCGACTG 

GTGCAGACAAATTTGTTAATTGAAAGTGATGATGTTGGGTGTGTCAAGAT
GCATGATCTGGTCCGTGCTTTTGTTTTAGGTATGTATTCTGAAGTAGAGC 
ATGCTTCAGTTGTCAACCATGGTAATATACCTGGATGGACTGAAAATGAT 
CCGACTGACTCTTGTAAAGCAATTTCATTAACATGCGAGAGTATGTCTGG 
AAACATTCCAGGAGACTTCAAGTTTCCAAACCTAACGATTTTGAAACTTA 

TGCATGGAGATAAGTCGCTAAGATTTCCACAAGACTTTTATGAAGGAATG
GAAAAGCTCCAGGTTATATCATACGATAAAATGAAGTATCCAATGCTTCC 
CTTGTCTCCTCAATGCTCCACCAACCTTCGAGTGCTTCATCTCCATGAAT 
GTTCATTAAAGATGTTTGATTGCTCTTGTATTGGAAATATGGCGAATGTG 
GAAGTGTTGAGCTTTGCTAATTCTGGCATTGAAATGTTACCTTCCACTAT 

CGGAAATTTAAAGAAGCTAAGGTTACTTGATTTAACAGATTGTCATGGTC
TTCATATAACACACGGTGTCTTTAACAATTTGGTCAAACTTGAAGAGTTG 
TATATGGGATTTTCTGATCGACCTGATCAAACTCGTGGTAATATTAGCAT 
GACAGATGTCAGCTACAATGAATTAGCAGAACGTTCAAAAGGCCTTTCTG 
CATTAGAGTTCCAGTTCTTTGAAAACAATGCCCAACCAAATAATATGTCG 

TTTGGGAAACTTAAACGATTCAAGATCTCAATGGGATGCACTTTATATGG
AGGATCAGATTACTTTAAGAAAACGTATGCTGTCCAAAACACATTGAAGT 
TGGTTACTAACAAAGGTGAACTATTGGACTCTAGAATGAACGAGTTGTTT 
GTTGAAACAGAAATGCTTTGTTTAAGTGTTGATGATATGAATGATCTTGG 
TGATGTTTGTGTGAAGTCCTCACGTTCTCCTCAACCTTCTGTGTTCAAAA 

TTCTAAGAGTCTTTGTCGTTTCCAAGTGTGTTGAGTTGAGATACCTTTTC
ACAATTGGTGTAGCCAAGGATTTGTCAAATCTTGAGCATCTTGAAGTTGA
TTCATGTAATAATATGGAACAACTCATATGTATTGAGAATGCTGGAAAAG 
AGACAATTACATTCCTAAAGCTGAAGATTTTATCTTTGAGTGGGCTACCA 
AAGCTTTCGGGTTTGTGCCAAAATGTCAACAAACTTGAGCTACCACAACT
CATAGAGTTGAAACTTAAGGGCATTCCAGGGTTCACATGCATTTATCCGC 
AAAACAAGTTGGAAACATCTAGTTTGTTGAAGGAAGAGGTAGATATATGT 
TTTATGTTAATACAAGTTAAAAAAT(nTrTTAACTAAAAGTTTCAGTATA 

TATATCTATATGTCTATAATTTGATTATATGATGTATTAGTGTTTGGATG
TGGCTATTAAGGGATGATTATTTTGCAGGTTGTGATTCCTAAGTTGGAGA 
CACTTCAAATTGATGAGATGGAGAATTTAAAGGAAATATGGCATTATAAA 
GTTAGTAATGGTGAGAGAGTTAAGTTGAGAAAGATTGAAGTGAGTAACTG 
TGATAAGCTTGTGAATCTATTTCCACACAACCCCATGTCTCTGCTGCATC 

ATCTTGAAGAGCTTGAAGTCAAGAAATGTGGTTCCATTGAATCGTTATTC
AACATCGACTTGGATTGTGTTGATGCCATAGGAGAAGAAGACAACATGAG 
GAGCTTAAGAAACATTAAAGTGAAGAATTCATGGAAGTTAAGAGAAGTGT 
GGTGTATAAAAGGTGAAAATAACTCTTGCCCCCrrGTTTCTGGCTTTCAA 
GCTGTTGAAAGCATAAGCATTGAAAGTTGTAAGAGGTTTAGAAATGTATT 

CACACCTACCACCACCAATTTTAATATGGGGGCACTTTTGGAGATATCAA
TAGATGACTGTGGAGAATACATGGAAAATGAAAAATCGGAAAAGAGTAGC 
CAAGAGCAAGAGCAGGTATGGATTTCAATITCACTTTCTTACTTACTTAA 
GGATTAAGCTTCTGTTTTTTTGAATAAAAAAGGGACATCTTCTAATAATG 
CACATCTTAAATTAAAAAGTATTTAATTGTTGCATAGCAGCGTATAACAT 

CTTCTAATAATTTATCTGAAGGTGAAAGATCCAACTACTTCTAATTTGTT
AACAATTTCAATCATTTGCAAATGTTCCTTAAAAAATTAATTACCTGAAA
TCAAAACAATCTTCTTCAAATCCAAAATTATGAGACAGAATTGAGAAGGG
ATGTGAAATTATAAACCATTAACACAATTCCATGCTCACGTTACTAATTA 
CATTTCTTGTTGGGATATATATGTACAGACTGATATTTTGTCAGAGGAAG 

TGAAATTACAAGAAGTCACTGATACTATTTCTAATGTTGTATTCACATCG
TGTCTCATACACTCTTTTTATAACAACCTCCGTAAACTCAACTTGGAGAA 
GTATGGAGGAGTTGAGGTTGTGTTTGAGATAGAGAGTTCAACAAGTAGAG 
AATTGGTAACAACATACCATAAACAACAACAACAACAACAACCTATATTT 
CCCAACCTTGAGGAATTATATCTATATTATATGGACAACATGAGTCATGT

ATGGAAGTGCAACAACTGGAATAAATTTTTACAACAATCAGAATCCCCAT
TCCACAACCTCACAACCATACACATGTCCGATTGCAAAAGCATTAAGTAC 
TTGTTTTCACCTCTCATGGCAGAACTTCTTTCCAACCTAAAGAGAATCAA 
TATTGACGAGTGTGATGGTATTGAAGAAATTGTTTCAAAAAGAGATGATG 
TGGATGAAGAA 


RG2O deduced polypeptide sequence (SEQ ID NO:119)

MD\-WAIIJCPVAETmEP\aaaiLGYnSSTKHVRDMSNmRELNAAMIAEEDHLD 

RNIRTRLEISNQVRSWLEEVEKIDAKVKALPSDVTACCSLKIKHEVGREALKLIVEIE 
SATRQHSLITWTDHPIPLGKVDSMKASMSTASTDYNDFQSREKTFTQALKALEPNN 
ASHMIALCGMGGVGKTTMMQRLKKVAKQNRMFSYMVEAVIGEKTDPIAIQQAVA
DYUUELKESTKPARADKUIEWFKANSGEGKNKFLVILDDVWQSVDLEDIGLSPFP 
NQGVDFKVLLTSRDEHVCTVMGVGSNSILNVGLLIEAEAQSLFQQFVETSEPELHKI 
GEDIVRKCCGLPIAIKTMACTLRNKRKDAWKDALSRIEHYDLRNVAPKVFETSYHN

LHDKETKSVFLMCGLFPEDFNIPTEELMRYGWGLKIFDRVYTFIEARNRINTCIERL 

VQTNLLIESDDVGCVKMHDLVRAFVLGMYSEVEHASVVNHGNIPGWTENDPTDSC 

KMSLTCESMSGNIPGDFm>NLTTIJa.NfflGDKSIJa^DFYEGMEiaQVISYD™ 

YPMU»LSPQCSTNUlVmiJIECSUCMFDCSaGNMAhrsnEVI^FANSGI^ 

LKKLRLLDLTDCHGLHITHGVFNNLVKLEELYMGFSDRPDQTRGNISMTDVSYNE 

LAERSKGLSALEFQFFENNAQPNNMSFGKLKRFKISMGCTLYGGSDYFKKTYAVQ 

NTLKLVTNKGELLDSRMNELFVETEMLCLSVDDMNDLGDVCVKSSRSPQPSVFKIL 

RVFWSKCVEUlYLFTIGVAKDl^NI^LEVDSCNNMEQUCffiNAGKETITFlJCm 

LSI^GUPKl^GLCQNVNKI^LPQLffiLKLKGIPGFTCIYPQNKLETSSLLKEEVVIPKL 

ETLQIDEMENLKEIWHYKVSNGERVKLRKIEVSNCDKLVNLFPHNPMSLLHHLEEL 

EVKKCGSmSLFNroLDCVDAIGEEDNMRSLRNKVKNSWKLREVWCIKGENNSCPL 

VSGFQAVESISIESCKRFRNVFrPTTTNFNMGALLEISIDDCGEYMENEKSEKSSQEQ 

EQTDII^EEVKLQEVTDTISNVVFrSCUHSFYNNLRKLNLEKYGGVEVVFEIESSTS 

RELVTTYHKQQQQQQPIFPNLEELYLYYMDNMSHVWKCNNWNKFLQQSESPFHN 

LTTIHMSDCKSIKYLFSPLMAELLSNLKRINIDECDGI 

RG2P polynucleotide sequence (SEQ ID NO: 120) 

CCCATTGCTATTCAGGAAGCAGTAGCAGATTACCTCNGTATAGAGCTCAA 

AGAAAAAACTAAATCNGCAAGAGCTGATATGCTTCGTAAAATGTTAGTTG 

CCAAGTCCGATGGTGGTAAAAATAAGTTCCTAGTAATACTTGACGATGTA 

TGGCAGTTTGTTGATTTAGAAGATATCGGTTTAAGTCCTTTGCCAAATCA 

AGGTGTTAACTTCAAGGTCTTGCTAACATCACGGGATGTAGATGTTTGCA 

CTATGATGGGAGTCGAAGCCAATTCAATTCTCAACATGAAAATCTTACTA 

GATGAAGAAGCACAAAGTTTGTTCATGGAGTTTGTACAAATTTCGAGTGA 

TGTTGATCCCAAGCTTCATAAGATAGGAGAAGATATTGTAAGAAAGTGTT 

GTGGTTTGCCTATTGCCATCAAAACCATGGCCCTTACTCTTAGAAATAAA 

AGCAAGGATGCATGGAGTGATGCACTTTCTCGTTTAGAGCATCATGACCT 

TCACAATTTTGTGAATGAAGTTTTTGGAATTAGCTACGACTATCTTCAAG 

ACCAGGAGACTAAATATATCTTTTTGCTTTGTGGATTGTTTCCCGAAGAC 

TACAATATTCCTCCTGAGGAGTTAATGAGGTATGGATGGGGCTTAAATTT 

ATTTAAAAAAGTGTATACTATAAGAGAAGCAAGAGCCAGACTCAACACCT 

GCATTGAGCGGCTTATCCATACCAATTTGTTGATGGAAGGAGATGTTGTT 

GGGTGTGTAAAGATGCATGATCTAGCACTTGCTTTTGTTATGGATATGTT 

TTCTAAAGTGCAGGATGCTTCAATTGTCAACCATGGTAGCATGTCAGGGT 

GGCCTGAAAATGATGTGAGTGGCTCTTGCCAAAGAATTTCATTAACATGC 

AAGGGTATGTCTGGGTTTCCTATAGACCTCAACTTTCCAAACCTCACAAT 

TTTAAAACTTATGCATGGAGATAAGTTTCTCAAGTTTCCTCCAGACTTTT 

ATGAACAAATGGAAAAGCTTCAAGTTGTATCGTTTCATGAAATGAAATAT 

CCGTTTCTTCCCTCGTCTCCTCAATATTGCTCCACCAACCTTCGAGTTCT 

TCATCTCCATCAATGCTCATTGATGTTTGATTGCTCTTGTATTGGAAATC 

TGTTTAATCTGGAAGTGTTGAGCTTTGCTAATTCTGGCATTGAATGGTTA 
CCTTCCAGAATTGGAAATTTGAAGAAGCTAAGGCTACTAGATTTGACAGA

TTGTTTTGGTCTTCGTATAGATAAGGGTGTCTTAAAAAATTTGGTCAAAC 

TTGAAGAGGTTTATATGAGAGTTGCTGTTCGAAGCAAAAAAGCCGGAAAT 

AGAAAAGCCATTAGCTTCACAGATGATAACTGCAATGAGATGGCAGAGCG 

TTC 



RG2P deduced polypeptide sequence (SEQ ID NO:121)

PIAIQEAVADYL7IELKEKTKSARADMLRKMLVAKSDGGKNKFLVILDDVWQFVDL 

EDIGLSPLPNQGVNFKVLLTSRDVDVCTMMGVEANSILNMKILLDEEAQSLFMEFV 

QISSDVDPKLHKIGEDIVRKCCGmAJKTMALTUlNKSKDAWSDAl^M.EHHDLHN 

FVNEWGISYDYLQDQETKYIFLIX:GLFPEDYNIPPEEIJ4RYGWGIJ«JTa^^ 

ARARLNTCIERLIHTNLLMEGDWGCVKMHD]J\LAFVMDMFSKVQDASIVNM 

MSGWPENDVSGSCQRISLTCKGMSGFPIDLNFPNLTILKLMHGDKFLKFPPDFYEQ 

MEKLQVVSFHEMKYPFLPSSPQYCSTNLRVLHLHQCSLMFDCSCIGNLFNLEVLSF 

ANSGffiWLPSRIGNUOOJUiDLTDCFGLRIDKGVLKNLVKLEEVYMRVAVRSKKA 

GNRKAISFTDDNCNEMAERS 



RG2Q polynucleotide sequence (SEQ ID NO:122) 

TGGGGAAGACACAGTGATAGAAAARAAAAAGAATGTTGTGGAAAAGAGGA 

AAATGTTTGATTATGCTGTTGTGGCGGTTATAGGGGAAAAGACGGACCCT 

ATTGCTCTTCAGAAAACTGTTGCGGATTACTTGCATATTGAGCTAAATGA 

AAGCACTAAACTAGCAAGAGCAGATAAACTTTGCAAATGGTTCAAGGACA 

ACTCGGATGGAGGTAAGAAAAAGTTCCTCGTAATACTCGACGATGTTTGG 

CAATCTGTTGATTTGGAAGATATTGGTTTAAGTACTCCTTTTCCAAATCA 

AGGTGTCAACTTCAAGGTTTTGTTGACATCACGAAAGAGAGAAATTTGCA 

CAATGATGGGAGTTGAAGCTGATTTAATTCTCAATGTCAAAGTCTTAGAA 

GAAGAAGAAGCACAAAAGTTGTTCCTCCAGTTTGTAGAAATTGGTGACCA 

ATACCACGAGCTTCATCAGATAGGGGTACATATAGTAAAGAAGTGTTATG 

GTTTACCCATTGCCATTAAAACCATGGCTCTTACTTTAAGAAATAAAAGA 

AAGGATTCATGGAAGGACGCACTCTCTCGTTTAGAGGACCATGACACTGA 

AAATGTTGCAAATGCAGTTTTCGAGATGAACTACCGCAATCTACAAGATG 

AGGAGACCAAAGCCATTTTTTTGCTTTGCGGTTTGTTCCCCGAAGACTTT 

GATATTCCTACTGAGGAGTTGGTGAGGTATGGATGGGGCTTAAATCTATT 

TAAAAAAGTGTATACCATAAGAAAGGCAAGAACGAGATCGCATACATGTA

TTGAGCGACTCTTGGATTCAAATTTGTTGATTGAAAGTAACGATATTCGG 

TGCGTCAAGATACACGATCTGGTGCGCGCTTTTGTTTTGGATATGTATTG 

TAAAGTTGAGCATGCTTCAATTGTCAACCATGGTAATATGCGGACCGAAT

ATAATATGGCTGACTCTTGCAAAACAATTTCATTAACATACAAGAGTATG

TCTGGGTTTGAGTTTCCAGGAGACCTCAAGTTTCCAAACCTAACAGTTTT 

GAAACTTATGCANGGAGATAAGTCTCTAAGGTTTCCTCAAGACTTTTATC

AATCAATGGAAAAACTTCGGGTTATATCATATGATAAAATGAAGTATCCA 

TTGCTTCCCTCATCACCTCAATGCTCCACTAACATCCGAGTGCTTCGTCT 
CCATGAATGTTCATTAAGGATGTTTGATTGCTCTTGTATTGGAAAGCTAT
TGAATTTGGAAGTCCTCAGCTTTTTTAATTCTAACATTGAATGGTTACCT
TCCACAATCAGAAATTTAAAAAAGCTAAGGCTACTAGATTTGAGATATTG 
TGATCGTCTTCGTATAGAACAAGGTGTCTTGAAAAATTTGGTCAAACTTG 
AAGAACTTTATACTGGATATACATCAGCGTTTACAGA

RG2Q deduced polypeptide sequence (SEQ ID NO:123) 

GEDTVIEKKKNVVEKRmFDYAWAVIGEKTDPIALQKTVADYLHIELNESTKLAR 
ADKLCKWFKDNSDGGKKKFLVILDDVWQSVDLEDIGLSTPFPNQGVNFKVLLTSR 

KREICTMMGVEADULNVKVLEEEEAQKLFLQFVEIGDQYHELHQIGVHIVKKCYG
IJPIAIKTMALTIJlNKRKDSWKDAI^RI^HDTEhrVANAVFEMNYRNUJDEE^ 
FIXCGLFPEDFDIPTEELVRYGWGLNU^VYTIMCARTRSHTCIERLLDSNUJESN 
DIRCVKfflDLVRAFVLDMYCKVEHASIVNHGNMRTEYNMADSCKTISLTYKSMSG 
FEFPGDLKFPNLTVLKLM7GDKSLRFPQDFYQSMEKLRVISYDKMKYPLLPSSPQCS 

TNIRVLRLHECSLRMFDCSCIGKLLNLEVLSFFNSNIEWLPSTIRNLKKLRLLDLRYC
DRLRIEQGVLKNLVKLEELYTGYTSAFTE 

RG2S polynucleotide sequence (SEQ ID NO:124) 

ATTTGGGGTTTTACATTTAATTTTTTGTGCATGAATGTGAAAATAGACTG 

CTTATTGATTCTTTGTGTTTCATTGAGTTGATTTTCATTATTACTACCTT
ACAAATTGCTCAGTGATAGATTTCCATTAATTTGCTAATTCGGTTGCTTC 
TAAATATGTAGGAGCTACTAAAAGCAAAAATATCGAGCAATGTCGGACCC
AACGGGGATTGCTGGTGCCATTATTAACCCAATTGCTCAGAGGGCCTTGG 
TTCCCGTTACAGACCATGTAGGCTACATGATTTCCTGCAGAAAATATGTG 

AGGGTCATGCAGACGAAAATGACAGAGTTGAATACCTCAAGAATCAGTGT
AGAGGAACACATTAGCCGGAACACAAGAAATCATCTTCAGATTCCATCTC 
AAATTAAGGATTGGTTGGACCAAGTAGAAGGGATCAGAGCAAATGTGGAA 
AACTTTCCGATTGATGTCATCACTTGTTGTAGTCTCAGGATCAGGCACAA 
GCTTGGACAGAAAGCCTTCAAGATAACTGAGCAGATTGAAAGTCTAACAA 

GACAGCTCTCCCTGATCAGTTGGACTGATGATCCAGTTCCTCTAGGAAGA
GTTGGTTCCATGAATGCATCCACCTCTGCATCATCAAGTGATGATTTCCC 
ATCAAGAGAGAAAACTTTTACACAAGCACTAAAAGCACTCGAACCCAACC
AACAATTCCACATGGTAGCCTTGTGTGGGATGGGTGGAGTAGGGAAGACT 
AGAATGATGCAAAGGCTGAAGAAGGCCGCTGAAGAAAAGAAATTGTTTAA 

TTATATTGTTAGGGCAGTTATAGGGGAAAAGACGGACCCCTTTGCCATTC
AAGAAGCTATAGCAGATTACCTCGGTATACAACTCAATGAAAAAACTAAG 
CCAGCAAGAGCTGATAAGCTTCGTGAATGGTTCAAAAAGAATTCAGATGG 
AGGTAAGACTAAGTTCCTCATAGTACTTGACGATGTTTGGCAATTAGTTG 
ATCTTGAAGATATTGGGTTAAGTCCTTTTCCAAATCAAGGTGTCGACTTC 

AAGGTCTTGTTGACATCACGAGACTCACAAGTTTGCACTATGATGGGGGT
TGAAGCTAATTCAATTATTAACGTGGGCCTTCTAACTGAAGCAGAAGCTC 
AAAGTCTGTTCCAGCAATTTGTAGAAACTTCTGAGCCCGAGCTCCAGAAG 
ATAGGAGAGGATATCGTAAGGAAGTGTTGCGGTCTACCTATTGCCATAAA
AACCATGGCATGTACTCTTAGAAATAAAAGAAAGGATGCATGGAAGGATG 
CACTTTCGCGCATAGAGCACTATGACATTCACAATGTTGCGCCCAAAGTC 
TTTGAAACGAGCTACCACAATCTCCAAGAAGAGGAGACTAAATCCACTTT 
TTTAATGTGTGGTTTGTTTCCCGAAGACTTCGATATTCCTACTGAGGAGT
TGATGAGGTATGGATGGGGCTTGAAGCTATTTGATAGAGTTTATACGATT 
AGAGAAGCAAGAACCAGGCTCAACACCTGCATTGAGCGACTGGTGCAGAC 
AAATTTGTTAATTGAAAGTGATGATGTTGGGTGTGTCAAGATGCATGATC 
TGGTCCGTGCTTTTGTTTTGGGTATGTTTTCTGAAGTCGAGCATGCTTCT 

ATTGTCAACCATGGTAATATGCCCGAGTGGACTGAAAATGATATAACTGA
CTCTTGCAAAAGAATTTCATTAACATGCAAGAGTATGTCTAAGTTTCCAG 
GAGATTTCAAGTTTCCAAACCTAATGATTTTGAAACTTATGCATGGAGAT 
AAGTCGCTAAGGTTTCCTCAAGACrrTTATGAAGGAATGGAAAAGCTCCA 
TGTTATATCATACGATAAAATGAAGTACCCATTGCTTCCTTTGGCACCTC 

GATGCTCCACCAACATTCGGGTGCTTCATCTCACTAAATGTTCATTAAAG
ATGTTTGATTGCTCTTGTATTGGAAATCTATCGAATCTGGAAGTGCTGAG 
CTTTGCTAATTCTCGCATTGAATGGTTACCTTCCACAGTCAGAAATTTAA 
AGAAGCTAAGGTTACTTGATCTGAGATTTTGTGATGGTCTCCGTATAGAA 
CAGGGTGTCTTGAAAAGTTTAGTCAAACTTGAAGAATTTTATATTGGAAA 

TGCATCTGGGTTTATAGATGATAACTGCAATGAGATGGCAGAGCGTTCTG
ACAACCTTTCTGCATTAGAATTCGCGTTCTTTAATAACAAGGCTGAAGTG
AAAAATATGTCATTTGAGAATCTTGAACGATTCAAGATCTCAGTGGGACG 
CTCTTTTGATGGAAATATCAATATGAGTAGCCACTCATACGAAAACATGT 
TGCAATTGGTGACCAACAAAGGTGATGTATTAGACTCTAAACTTAATGGG

TTATTTTTGAAAACAAAGGTGCTTTTTTTAAGTGTGCATGGCATGAATGA
TCTTGAAGATGTTGAGGTGAAGTCGACACATCCTACTCAGTCCTCTTCAT 
TCTGCAATTTAAAAGTTCTTATTATTTCAAAGTGTGTAGAGTTGAGATAC 
CTTTTCAAACTCAATCTTGCAAACACTTTGTCAAGACTTGAGCATCTAGA 
AGTTTGTGAATGCGAGAATATGGAAGAACTCATACATACTGGAATTTGTG 

GAGAAGAGACAATTACTTTCCCTAAGCTGAAGTTTTTATCTTTGAGTCAA
CTACCGAAGTTATCAAGTTTGTGCCATAATGTCAACATAATTGGGCTACC 
ACATCTCGTAGACTTGATACTTAAGGGCATTCCAGGTTTCACAGTCATTT 
ATCCGCAGAACAAGTTGCGAACATCTAGTTTGTTGAAGGAAGAGGTAGAT 
ATATGTTCTTTATGTTAATACAATTTAAATAATATTTTCAACCAAATTTT 

CATAATATATCTGTAATTTGATTGTATGATGTGTTATTGTTTATATGTGG
CTATTAAGGGATGATTATTTTGCAGGTTGTGATTCCTAAGTTGGAGACAC 
TTCAAATTGATGACATGGAGAACTTAGAAGAAATATGGCCTTGTGAACTT
AGTGGAGGTGAGAAAGTTAAGTTGAGAGAGATTAAAGTGAGTAGCTGTGA 
TAAGCTTGTGAATCTATTTCCGCGCAATCCCATGTCTCTGTTGCATCATC 

TTGAAGAGCTTAAAGTCAAGAATTGCGGTTCCATTGAATCGTTATTCAAC
ATTGACTTGGATTGTGTCGGTGCAATTGGAGAAGAAGACAACAAGAGCCT 
CTTAAGAAGCATCAACATGGAGAATTTAGGGAAGCTAAGAGAGGTGTGGA
GGATAAAAGGTGCAGATAACTCTCATCTCATCAACGGTTTTCAAGCTGTT 
GAAAGCATAAAGATTGAAAAATGTAAGAGGTTTAGCAATATATTCACACC
TATCACCGCCAATTTTTATCTGGTGGCACTTTTGGAGATTCAGATAGAAG 
GTTGCGGAGGAAATCACGAATCAGAAGAGCAGGTAACGCTTTCAATTTAA 
CTTTCTTAAGTAATTAAGGACTAACCrCCTGTTTTTTGAATAATAAAGAG 
GTGGGATGACTAAACTTGGGCATCACAATTGCAACAAAATGTTACAAACC
ATGAAACGTTCAAACCATTTCTTGAATTAAGGTTTCAATACAAGTCATTT 
AAAAATATGGCTTAAATTTTTTTATATTTATGTATCAACATGATTTTTCA 
TTAGAGATCATTATTATAATAGTAAGTTTAAAGCAATTTAAATTAGAACT 
AATTCTAACTTTAGCTAATAAATCGTTATAAATGTAAATAATTACTTTTT 

AGTGAAATAAGCAACGGATTTAATAAGTTAACAACTTAAATGTCATTTCC
TAACAAAAAAAACTATTTGGTTCAGAAGAACCGTAATTCAAGATAACTAA 
AATAAAAATATTTGACATTCACTAAGAGCATTTTTTTTTCTAAATATGAT 
TGCAAATGAATAAAACTTAAATTTATACAGAAAAGATTTTTATATATGTT 
ATACAAAATTTACAAATTGAAACTGGATATGTTAATTAACGGTTTATAAT 

TCTGGTATCACAAAGGGATATATAATAAAATATTATTTTCTGTAGTCATT
TATAATTGTACTAGTTTATAACCCGTGGGAACCATGAGTTCTAAAATTAG
TTAAACTTTCATAATAAAAATTTATAATTATTATTTATTTTAAATAAATT 
ATTAATTAAGAGATGTATCAAAAATTTAAAGTTATTATAACTTCAAATTT
AACATATAATTAGAAAATATATGATCATAACTTTCCGCAACTCTTCTTTT 

GTATTAAAATGCCCAGAGAAGCTCTTAGTAYATTTTCTAAATCAAAGTCA
CAAAACTAATGAAGCATATAATTTTGTGAAAATCAATTAGCATTAGGTTT
TAAGAGTCACCAAATTCAAAGAGTAATCCAATGCTTTCATTACCACTATG 
GAGAAAATATTTTCTTAGTTTAAATGAAATGAAAACAAACATTCAAACTA
ATTGTTGCTTACTAAACCAAAGACCCATTACTTAGCCAAGAGTTTAACCA 

AAAAAAATTACATTCATGTATCATTATTCATGACTAGATATATATGAACA
TGAAGGGAGTTTTTATAGAAAATATAATCATAGATATTCAACATAACTTC
ATGGAATTCCTCAAAATAACCAAGTTATTCAAGAAATTACATCCAAGTCA 
ACCAAAGAGAAGTTTAGCCTAGCATGGCTAAACTCAAGAAAATAAAATAA
GGATTAGAAGTACCAAACATGTAGTAAGAATCACAGTAAAAGATGATGTT 

GTTCTTGATGTTCTTCTAAGTTCTTCAAGTCTCCAGTTGCTCCTAATAAT
GCAAAGGAGAGCCATTAAATTCGTATGTATTGATCCCTTCAAAAGCTGCA 
CCAACCTCCCTTAAATAACACTCAAAGCAAAAATGACAAAATTGCCCCTG
AAGGACCCTATGCGGGTGCCTTGCGCGGGTGGAGCTGAATACGAAAGGTC 
TTTGGTCTTTGTGAGGGTGATGCTGTGCGGGTTAGCTTGTCGCATGCTTC 

CGCGCGGTTCGCGCACATGTGCACAAGTGATGCATGGTGTGTACGTTCTT
GAGTTTTGAGCCTCCGATGCTTAGTCCATTTGGCCCAATTCGAGTCCAAT 
CAGCTTATGACCCATTTTTCTTCAAGTTATCTTCAAGTTATCTTCAAGTT 
AAGCCCAAATTGCCTTCTCCAAATCATCCATAACTTCACAAAATCGCCCG 
TTCATCTTAATCCCGAATGCACAATTATTCTCCTGTCTTCCTTTTAAGCA 

AGATACCACCTTCTTCATGCTTCATCCATCAATAGTACACTTCATGTATC
ATCTCTACTAGTTATTTAGTCCACAATCCTTATTGTCCTCCAAATTTAAT 
TATCTCATTTAGTTCCCGTTCCACTAGTTTCCTTAAAATTTGCAATTAAG 
CTCACACAAATATTAAGTACCTGAAATGGTCATAAAATAACAAAAAGGAA 
AATATGCATGAAGATTAACTAAATGATGAACGAAATATGCTAAAATAGAC
TATAAAATGAAGTAAATAAAATGAAATTATCGCACTCCGACCACCCTTAT
AGGCTTGTAGTCCATCCACCCTTCATTCCTTGTACCAATATGGGATGGAA 
ACATCATTAATTAAGCCAAAAAACTAACATATAAGGGGTGAGTGACAAAG 
GTAAGTACTAAAGATGAAAATAATCCATTTTTYTTGTATATACACAACAC
ACACATAGGGGCAGACGTAGGATTTCATAGTACAGATTGTTGGTGGCACA 
TAAGTGTTGCTGGTGACACTTTTTTTTTCTTTTACGTAGTGGCACAACAG 
TAGAAAAAACGARAAATTCGAAATTTTTTACAATGTGTSTAAAAAAAAYA
GTGGTTGTTGGTGCCACTATGGACACCAAAGTTGAACTGCCCCTGCGCGC 

RCACACACACACACATAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAG
ARAGWAWGRRRGAKAKARMCSMSYTTGGGATGTGATACTTCTTTTAGGAA 
AATGGAGTTATATCTTTGATATTGTATTTTTTTAATGTAATTTATATATT 
TAATCATTTTAGTTTATAAGTTTTATTTATTTTGATATGAAAAAAAAAGT 
CTTTTATACATTGGATTTAACATAAAAATCCAACAATATTAATCAAAAAG 

ACCAMACATGTGGACAMWTATGTATATAAWTAATTCACAATAGTCTTTAG
GAATAGNATTATATATATAATTAATTCTCAATGGTCTTAGGAATAGTAAG 
TTCTTATATTTCAAACTTTNGCCACAATTCTTTGKTTACTTWGACACTTY 
CCTCTCTCTAATTATATATATATATATATATATATATATATATATACACA 
CACACACACACACACTAGATGTGTGCCCGCGCAAAGCAGTGACGTNNNGG 

AGAANACTTTCTTAAGCATAAATAATTATTATATTTTTTATTGGGTATTA
TATAATAAAAAATTACAACTTTTAAATAAAATATTTATGTTTATACTTTA
TATTTATATTGCnTGTATACTATTAATATAATAAATTAATATTTATGTCT 
AATTTATGAAATGTAAATTAATTTAAATACATGAATTTAATATTTTTAAA 
ATTTTCAGTTTGCTTCAAATTGAGTTTCTTAATTATTTTTTTTAATTCAN 

GTATTCAAACTTTTGGTAAGTATTAAAGAATTATTTATGCACAATTGATT
TATACAAAAAACTTTGTAACTTATACATCTTAAAATTCAAGATATAACTA 
ACATGTTTTACAATATATATATATATANATATATATATATATATATATAT 
ATATATATATATATATAGTAAAGCGCANAGGTCATAGGNANAGANTATTT 
TCTATTATTCTACGTTTTGCCACAAAAGTTTGAACACTTTGCCACTTTTT 

GTCCCTCCTTAACCTTTTCAATGTTTTGCGACAAAAGTTCCAAAACTTTG
CCACTTTGATCATTCCTCAACTTTTCACCGCATTAGTTTGTGGAGTTGGC 
AGTTTTGGTCCCCCTAACTTCGATATTTTCTCCTGCTAGCCAAAAAGGGT 
TCCAGAGTTTCACANTTTTGGTCCCTGACAATAACCAAATGTGAGATGTC 
AAATTTTTGCCACATTAGTTTGTGGAGTTGTCCCTTTTGGTCCCCCCACA 

TTCGATATTCTACTATACGACCTTATTTTTCTCAAATAACAACACGTATA
TTTAATTACCAATGATAGAAATAGATATCAAATAAAGTATTTGTAACACC
GTGTAAGAACGGTGCTACTATAGGTAAAAATAAACATTTCAAAGTACGAT 
GTCCTAATTGGAAAAAGAGTTTTAAAAAAATAACAACTAGGGGCGAGTTT 
TTTTTACAAGTTTGTATCAAATCATATCAAAATTTAAGGTGGAACGGTGA 

CCACATTAACCAGAAATGTAATTTATTCTTTGATTTTGATAATTTTTAAT
ATTTTGTTGTGATCTATGTATTTAAAAGTAAACAACAAAGAACATAATCC 
AAAACCCTAAATTGCAAGTCTCGCCCAATTTCTCTATCACTAGTCGTCAC 
TTACGATGGCGTTACGTCGCTCTCTCACTTCTTACAACCCTTTGTTGCTA 
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CTCATTACAATAACGAAAAGTTGAATATCCATATATTTATTTGGATGTGG 

AATTGAACAAATCTCGTCAAATTTTTGATTTTGTTGATGGATTTGAGTAG 

AAGTTTGGGCAGAACGGGAATGATGGTCTGCAAGTGGTTATAAACTTGAT 

TCTGAGTTATTACTATATATGTAGCCTCirrACAACGACCAAGGTTTCTT 

CCAGGTACCATTTGATCTTTTTAGAACCCAGTTGTCTGAAACACCCTGAT 

TTGGATCAAATATCACCAACAACTCTTAAAAACTTGATTAATCAATTGTT 

TTCTTCATCTTGATAACAAGTGGAATGATTTTCTACTTAGATTAACTTGA 

AAAAAAAGGTCCATGTGCGTCTGGTGGATCTGGTAAATGAAGATGGAAGG 

GAGAGCTGACTTTAAAGACACAAACACGTCACCATATCTTTTATTTTATT 

TTAAATTrGCTTTTTTCCTATTTCTTTCTTTCTTGATCTCCAGATGGTAT 

GTGGTGTGGATAATTTACACATAGAGATTGGGAACGACTGTGTTTTAGAG 

AGGACGTGGCTTGGGGTTGAGGATGGTTTATGGCTGGCCGAGTTTCATTT 

ATATAAACAAACAAATATATAAAACAAGGGGTAAAATGGCCATCTTATAT 

GTATTTAACCGTCCTTTTTTATTTTTTTm 

GGTATACCAGTGTCAGCCTCTTATTCCCAACCAGGCAACCAGTCAAATAG 

GGACTTAGGTTGTTTGGAAACAGTTCCGTGAGACCGTGACTTGGATGGTA 

GATAAATTTAGTAAACTTAACCCTTCAATTAACCTACCTTTTTCTTATTA

ACTCAATTTCAACCTAAATTCTGATTCTTGTTTGAAAATAAGTTGCATCT 

TTATGTTTGTATTATCCTGTTGCATAGGATCCTTAGCATCTTTTAATAAT 

TTATTTGAAGGTGAAAGATCCAACTATTTTTTAGCTGTTGGCATTTTCCA 

TCATTTGCAACTGTTTCTTGAAAAAAAAATACCTAAAATCAAAATAACCA 

TTTTCAAATCCAAAATTATAAGAGAGAATTGTTAATGGACGTGGAATCGT 

AAATCATTAACACAGTTCAGTACACAAGTTGCTAATTACATTTCTTGCTG 

TGCAGATTGAAATTCTATCAGAGAAAGAGACATTACAAGAAGTCACTGAT 

ACTAATATTTCTAATGATGTTGTATTATTCCCATCCTGTCTCATGCACTC

TTTTCATAACCTCCATAAACTTAAATTGGAGAGAGTTAAAGGAGTGGAGG 

TGGTGTTTGAGATAGAGAGTGAGAGTCCAACAAGTAGAGAATTGGTAACA 

ACTCACCATAACCAACAACATCCTATTATACTTCCCAACCTCCAGGAATT 

GGATCTAAGTTTTATGGACAACATGAGTCATGTGTGGAAGTGCAGCAACT 

GGAATAAATTCTTCACTCTTCCAAAACAACAATCAGAATCCCCATTCCAC

AACCTCACAACCATACACATGTTCAGCTGCAGAAGCATTAAGTACTTGTT 

TTCGCCTCTCATGGCAGAACTTCTTTCCAACCTAAAGGATATCTGGATAA 

GTGGGTGTAATGGTATTAAAGAAGTTGTTTCAAAGAGAGATGATGAGGAT 

GAAGAAATGACTACATTTACATCTACCCACACAACCACCATCTTGTTCCC 

TCATCTTGATTCTCTCACTCTAAGACTACTGGAGAATCTGAAGTGTATTG 

GTGGAGGTGGTGCCAAGGATGAGGGGAGCAATGAAATATCTTTCAATAAT 

ACCACTGCAACTACTGCTGTTCTTGATCAATTTGAGGTATGCTTTGTACA 

TATTCAATTATTTATTTAATTTCCTTTTTTCTTTGCAATATTCTATAAAT 

AATACATTTTATACCCACTATACTAAGATAATAATTACCTAGAGGGATGG 

ATGCTATGACACAGCTGCTACACTTCAGAAACTCTAGTAAGGGCAGTTAT 

GGAAGTTCAATAAAATGATAATGGCATCTTTTGATGGGTAATATAGGCAA

TTT.ViGTTTTATTTCTGTTAAAGCAGTATTTAGCAAGTACTGGCCAGTAG 

GAGAGGAGAATATCACCTTTTGTGAAAATCTGGTCATTGTACCCAAGAAT 
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TTAGTTAAATGTAACATTTTAGATATCAGGGGACATCAGGTGACAGATAT 
TGTAGAATAGAACAATATATAATATTACCCAAAACTATTTTTTCTAAGGT 
TATTCTGTTAAATATGTGCTTTCTTGATTTCATTGAATTTGCATTCCTAT 
ATTTTAGGTGGTAAAGTGATTGTCTCTTCAATAAATCCCGAAATTAATTA 

AAAAAAAAAAAAACAAAAGTAAATTTTTGATATGGAGAGCACTGGTATCA
TTTAGTATATAAAAAAACTAGATTTTGAATTAAGTTTCTTATATAAAAGC 
TGTGTATATAGTTTAATTAGTTTTACATCATTTTTCCATGTGGTGTTGCA 
GTTGTCTGAAGCAGGTGGTGTTTCTTGGAGTTTATGCCAATACGCTAGAG 
AGATAGAGATATCTAAGTGTAATGTATTGTCAAGTGTGATTCCATGTTAT 

GCAGCAGGACAAATGCAAAAGCTTCAAGTGCTGAGAGTAACGGGTTGTGA
TGGCATGAAGGAGGTATTTGAAACTCAATTAGGGACGAGCAGCAACAAAA 
ACAGAAAGGGTGGTGGTGATGAAGGAAATGGTGGAATTCCAAGAGTAAAT 
AACAATGTTATTATGCTTCCCAATCTAAAGACATTGAAAATCTACATGTG
CGGGGGTTTGGAACATATATTCACATTCTCTGCACTTGAAAGCCTGACAC 

AGCTCCAAGAGTTAAAGATAGTGGGTTGCTACGGAATGAAAGTGATTGTG
AAGAAGGAAGAAGATGAATATGGAGAGCAGCAAACAACAACAACAACAAC
AACGAAGGGGGCATCTTCTTCTTCTTCTTCTTCTTCTTCTAAGAAGGTTG 
TGGTCTTTCCCCGTCTAAAGTCCATTGAACTATTCAATCTACCAGAGCTG 
GTAGGATTCTTCTTGGGGATGAATGAGTTCCGGTTGCCTTCATTGGAAGA 

AGTTACCATCAAGTATTGCTCAAAAATGATGGTGTTTGCAGCTGGTGGGT
CCACAGCTCCCCAACTCAAGTATATACACACAAGATTAGGCAAACATACT 
CTTGATCAAGAATCTGGCCTTAACTTTCATCAGGTATATATATATTCCTT 
TAATTGGCATGATCTAATTAAGAAAGATATCATTCCTGCCAAGTAAATTT 
ACTTCAAACACATTCACACTGGTTTCAGTCTAAGTTTATGTTGTTCTAGG 

AAGGCCAAAATGGGAAAGCAAGATAGGGAAAAATAGTGTATTTCAGTGGA
AAGGGTATTTTAGGTATTTTCTGTCAAAAGTTGTTATTGCAGGCTTTTTA 
GTACCTGGAATCGTGTGTGGGAGGAGCGTTATTATTCTGATTTGCTTGTT 

TTTTGATTTTAAATGACAAAATTTTTCCCTGTTACTCTATTTGATTGTTG 
TTCTTCATGGTTCTAAGTGAGTTATTGGCTCATCTGTTACTTCTTTTGAT
TGTTATTTTCATATCATGTTGTCCmGAATCAAGCTTTTCCATTTTCAA 
CCAGGGCAAAAGGTCAAAAGTAACCTACTTTATGAGATCAAAAACAGCAA 
CCCATCGGATAACTTTTAGTTGGAGTTAATAGTTACAATTACCATTGTGA 
TTAATAATTATAATATCTTGTATTAATTCATTAAAATTGGTACAGCACAT
ATATGACATTTTAAAGGTTTGTTTTTGTTWGACATATATATGCCTCTGGC
GTTTTCTTTATTGGACATGCAGACCTCATTCCAAAGTTTATACGGTGACA 
CCTCGGGCCCTGCTACTTCAGAAGGGACAACTTGGTCTTTTCATAACTTG 
ATCGAATTAGATATGGAATTAAATTATGATGTTAAAAAGATTATTCCATC 
CAGTGAGTTGCTGCAACTGCAAAAGCTGGAAAAGATTCATGTGAGTAGTT 
GTTATTGGGTAGAGGAGGTATTTGAAACTGCATTGGAAGCAGCAGGGAGA
AATGGAAATAGTGGAATTGGTTTTGATGAATCGTCACAAACTACTACTAC 
TACTACTCTTTTCAATCTTCGAAACCTCAGAGAAATGAAGTTGCATTTTC 
TACGTGGTCTGAGGTATATATGGAAGAGCAATCAGTGGACAGCATTTGAG 
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TTTCCAAACCTAACAAGAGTTCATATAAGTAGGTGTAGAAGGTTAGAACA 
TGTATTTACTAGTTCCATGGTTGGTAGTCTATTGCAACTCCAAGAGCTAG 
ATATTAGTTGGTGCAACCATATGGAGGAGGTGATTGTTAAGGATGCAGAT 
GTTTCTGTTGAAGAAGACAAAGAGAGAGAATCTGATGGCAAGACGAATAA 

GGAGATACTTGTGTTACCTCGTCTAAAATCCTTGAAATTAAAATGCCTTC
CATGTCTTAAGGGGTTTAGCTTGGGGAAGGAGGATTTTTCATTCCCATTA
TTGGATACTTTAGAAATCTACAAATGCCCAGCAATAACGACCTTCACCAA 
GGGAAATTCTGCTACTCCACAGCTAAAAGAAATAGAAACAAGATTTGGCT 
CGTTTTATGCAGGGGAAGACATCAACTCCTCTATTATAAAAAGATCAAAC 

AACAGGTAAATCAGATCTTTGTTGCTTTAATAATTCTTAAACTACATTTG
AAAAGCTTCATGCAAGTTTTTTTTGTTATATTGTCAAAAACCGCAACCTA 
CATTTTCAGCTTTATATTTATGTACTTTATGCAGGAGTTCAAACAAAACT 
CTGATTAATGTGAAGTGAATATTAAAGGTAAATTATATTTTCATGTTCCT 
AGTTGCCTATTAATTAATGGCCTTTTAGTTCRTGATTTTTGGATGTAGTY 

WTCATGATGATGTGAATCTTCTAATACCCCATTCATTGTTTGGTTGAATG
TTGACTCTATGTCAGGATGAATATTCAAGGGAAGAATTGTTCATCATATG 
AAGGACATTAAAGAACATGGATGCTATGAAGATGTTGGAARAC 

RG2S deduced polypeptide sequence (SEQ ID NO:125) 

MSDPTGIAGAnNPIAQRALVPVTDHVGYMISCRKYVRVMQTKMTELNTSRISVEEH
ISRNTIWHLQIPSQKDWLDQWGIRANVENFPIDVITCCSIJURHKLGQKAFKITEQI 
ESLTRQLSLISWTDDPVPLGRVGSMNASTSASSSDDFPSREKTFTQALKALEPNQQF 
HMN'ALCGMGGVGKTRMMQRLKKAAEEKKLFNYIVRAVIGEKTDPFAIQEAIADYL 
GIQLNEKTKPARADKlJiEWFKKNSDGGKTKFLIVLDDVWQLVDLEDIGLSPFPNQG 

VDFKVLLTSRDSQVCTMMGVEANSIINVGLLTEAEAQSLFQQFVETSEPELQKIGED
IVRKCCGLPIAIKTMACTLRNKRKDAWKDALSRIEHYDIHNVAPKVFETSYHNLQE 
EETKSTFLMCGLFPEDFDIPTEELMRYGWGLKLFDRVYTIREARTRLNTCIERLVQT 
NLUESDDVGCVKMHDLVRAFVLGMFSEVEHASIVNHGNMPEWTENDITDSCKRIS 
LTCKSMSKFPGDFiCFPNLMIUCLMHGDKSLRFPQDFYEGMEKLHVISYDKMKYPLL 

PLAPRCSTNIRVLHLTKCSLKMFDCSCIGNLSNLEVLSFANSRffiWLPSTVRNLKKLR
LLDLRFCDGLRIEQGVLKSLVKLEEFYIGNASGFIDDNCNEMAERSDNLSALEFAFF 
NNKAEVKNMSFENLERFKISVGRSFDGNINMSSHSYENMLQLVTNKGDVLDSKLN 
GLFLKTKVLFLSVHGMNDLEDVEVKSTHPTQSSSFCNLKVLnSKCVELRYLFKLNL 
ANTLSRLEHLEVCECENMEELlHTGICGEETITFPKLKFLSLSQLPKLSSLCHNVNnG 

LPHLVDLILKGIPGFTVIYPQNKLRTSSLLKEEWIPKLETLQIDDMENLEEIWPCELS
GGEKVKLREIKVSSCDKLVNLFPRNPMSLLHHLEELKVKNCGSIESLFNIDLDCVGA 
IGEEDNKSLLRSINMENLGKLREVWRKGADNSHLINGFQAVESIKIEKCKRFSNIFT 
PITANFYLVALLEIQIEGCGGNHESEEQIEILSEKETLQEVTDTNISNDVVLFPSCLMH 
SFHNLHKLKLERVKGVEVVFEIESESPTSRELVTTHHNQQHPIILPNLQELDLSFMD 

NMSHVWKCSNWNKFFTLPKQQSESPFHNLTTIHMFSCRSIKYLFSPLMAELLSNLK
DIWISGCNGIKEVVSKRDDEDEEMTTFTSTHTTTILFPHLDSLTLRLLENLKCIGGGG
AKDEGSNEISFNNTTATTAVLDQFELSEAGGVSWSLCQYAREIEISKCNVLSSVIPCY
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AAGQMQKLQVLRVTGCDGMKEVFETQLGTSSNKNRKGGGDEGNGGIPRVNNNVI- 
MLPNLKTLKIYMCGGLEHIFTFSALESLTQLQELKIVGCYGMKVIVKKEEDEYGEQ
QTTTTTTTKGASSSSSSSSSKKVVVFPRLKSIELFNLPELVGFFLGMNEFRLPSLEEVT
IKYCSKMMVFAAGGSTAPQLKYIHTRLGKHTLDQESGLNFHQTSFQSLYGDTSGPA
TSEGTTWSFHNLIELDMELNYDVKKIIPSSELLQLQKLEKIHVSSCYWVEEVFETAL
EAAGRNGNSGIGFDESSQTTTTTTLFNLRNLREMKLHFLRGLRYIWKSNQWTAFEF
PNLTRVHISRCRRLEHVFTSSMVGSLLQLQELDISWCNHMEEVIVKDADVSVEEDK
ERESDGKTNKEILVLPRLKSLKLKCLPCLKGFSLGKEDFSFPLLDTLEIYKCPAITTFT
KGNSATPQLKEIETRFGSFYAGEDINSSIIKRSNNRSSNKTLINVK.ILK


RG2T polynucleotide sequence (SEQ ID NO:126) 

GGAAGACGACAATGGTGCAACGGTTGAAGAAGGTTGTGAAAGATAAGAAG

ATGTTCCATTATATTGTCGAGGTGGTTGTAGGGGCAAACACTGACCCCAT 

TGCTATCCAGGATACTGTTGCAGATTACCTCAGCATAGAACTGAAAGGAA 

ATACGAGAGATGCAAGGGCTTATAAGCTTCGTGAATGCTTTAAGGCCCTC
TCTGGTGGAGGTAAGATGAAGTTCCTAGTAATTCTTGACGATGTATGGAG 
CCCTGTTGATCTGGATGATATCGGTTTAAGTTCTTTGCCAAATCAAGGTG 
TTGACTTCAAGGTCTTGCTGACATCACGCAACAGTGATATCTGCATGATG 
ATGGGAGCTAGTTTAATTTTCAACCTCAATATGTTAACAGACGAGGAAGC 

ACATAATTTTTTCCGTCGATACGCAGAAATTTCTTATGATGCTGATCCCG
AGCTTATTAAGATAGGAGAAGCTATTGTAGAGAAATGTGGTGGTTTACCC 
ATTGCCATCAAAACTATGGCCGTTACTCTTAGAAATAAACGCAAAGATGC 
ATGGAAAGATGCACTTTCTCGTTTAGAGCACCGTGACACTCATAATGTTG 
TGGCTGATGTTCTTAAATTGAGCTACAGCAATATCCAAGACGAGGAGACT 

CGGTCGATTTTTTTGCTATGTGGTTTGTTTCCTGAAGACTTTGATATTCC

TACCGAAGACTTAGTGAGGTATGGATGGGGATTGAAAATATTTACCAGAG 
TGTATACTATGAGACATGCAAGAAAAAGGTTGGACACGTGCATTGAGCGG 
CTTATGCATGCCAACATGTTGATAAAAAGTGATAATGTTGGATTTGTCAA 
GATGCATGATCTGGTTCGTGCTTTTGTTTTGGGCATGTTATCTGAAGTCG 

AGCATGCATCAATTGTCAACCATGGGGATATGCCAGGGTGGTTTGAAACT
GCAAATGATAAGAACAGCTTGTGCAAAAGAATTTCATTAACATGCAAAGG
TATGTCTGCGATTCCTGAAGACCTCACGTTTCCAAACCTCTCGATCCTGA 
AATTAATGGATGGAGACGAGTCACTGAGGTTTCCTGAAGGCTTTTATGGA
GAAATGGAAAACCTTCAGGTTATATCATATGATAACATGAAGCAGCCATT 

TCTTCCACAATCACTTCAATGCTCCAATGTTCGAGTGCTTCATCTCCATC
ACTGCTCATTAATGTTTGATTGCTCTTCTATTGGAAATCTTTTGAATCTC 
GAGGTGCTCAGCATTGCTAATTCTGCCATTAAATTGTTACCCTCCACTAT 
TGGAGATCTGAAGAAGCTAAGGCTCCTGGATTTGACAAATTGTGTTGGTC 
TCTGTATAGCTAATGGCGTCTTTAGAAATTTGGTCAAACTTGAAGAGCTT 

TATATGAGAGTTGATGATCGAGATTCGTTTTTTGTGAAAGCTGATGACAG
CAAGACCATTACCT

RG2T deduced polypeptide sequence (SEQ ID NO:127)

KTTMVQRLKKVVKDKKMFHYIVEVVVGANTDPIAIQDTVADYLSIELKGNTRDAR
AYKLRECFKALSGGGKMKFLVILDDVWSPVDLDDIGLSSLPNQGVDFKVLLTSRNS
DICMMMGASLIFNLNMLTDEEAHNFFRRYAEISYDADPELIKIGEAIVEKCGGLPIAI
KTMAVTLRNKRKDAWKDALSRLEHRDTHNVVADVLKLSYSNIQDEETRSIFLLCG
LFPEDFDIPTEDLVRYGWGLKIFTRVYTMRHARKRLDTCIERLMHANMLIKSDNVG
FVKMHDLVRAFVLGMLSEVEHASIVNHGDMPGWFETANDKNSLCKRISLTCKGMS
AIPEDLTFPNLSILKLMDGDESLRFPEGFYGEMENLQVISYDNMKQPFLPQSLQCSN
VRVLHLHHCSLMFDCSSIGNLLNLEVLSIANSAIKLLPSTIGDLKKLRLLDLTNCVGL
CIANGVFRNLVKLEELYMRVDDRDSFFVKADDSKTIT

RG2U polynucleotide sequence (SEQ ID NO:128) 

GCCTTGTGTGGGATGGGTGGAGTGGGAAAGACCACTGTGATGAAGAAGCT 
GAAGGAGGTTGTGGTAGGAAAGAAACTGTTTAATCATTATGTTGAGGCGG 

TTATAGGGGAAAAGACAGACCCCATTGCTATTCAACAAGCTGTTGCCGAG
TACCTTGGTATAAGTCTAACCGAAACCACTAAACCAGCAAGAACTGATAA 
GCTCCGTACATGGTTTGCAAACAACTCAAATGGAGGAAAGAAGAAGTTCC 
TGGTAATACTAGACGATGTATGGCAACCAGTTGATTTGGAAGATATTGGT 
TTAAGTCGTTTTCCAAATCAAGATGTTGACTTCAAGGTCTTGATTACATC 

ACGGGACCAATCAGTTTGCACTGAGATGGGAGTTAAAGCTGATTTAGTTC
TCAAGGTGAGTGTCCTGGAGGAAGCGGAAGCACACAGTTTGTTCCTCCAA 
TTTTTAGAACCTTCTGATGATGTCGATCCTGAGCTCAATAAAATCGGAGA 
AGAAATTGTAAAGAAGTGTTGCAGACTACCCATTGCTATCAAAACCATGG 
CCTGAACTCTTAGAAGTAAAAGTAAGGATACATGGAAGAATGCCCTTTCT 

CGTTTACAACACCATGACATTAACACAATTGCGTCTACTGTTTTCCAAAC
TAGCTATGACAATCTCGAAGACGAGGTGACTAAAGCTACTTTTTTGCTTT 
GTGGTTTATTTCCGGAGGACTTCAATATTCCTACCGAGGACCTATTGAGG 
TATGGATGGGGATTGAAGTTATTCAAGGAAGTAGATACTATACGAGAAGC 
AAGATCCAAGTTGAAAGCCTGCATTGAGCGGCTCATGCATACCAATTTGT 

TGATCGAAGGTGATGATGTTAGGTACGTTAAGATGCATGATCTGGTGCGT
GCTTTTGTTTTGGATATGTTTTCTAAAGCCGAGCATGCATCTATTGTCAA 
CCATGGTAGTAGTAAGCCAAGGTGGCCTGAAACTGAAAGTGATGTGAGCT 
CCTCTTGCAAAAGAATTTCATTAACATGCAAGGGTNTG 

RG2U deduced polypeptide sequence (SEQ ID NO:129)

ALCGMGGVGKTTVMKKLKEVVVGKKLFNHYVEAVIGEKTDPIAIQQAVAEYLGIS 
LTETTKPARTDKLRTWFANNSNGGKKKFLVILDDVWQPVDLEDIGLSRFPNQDVD 
FKVLITSRDQSVCTEMGVKADLVLKVSVLEEAEAHSLFLQFLEPSDDVDPELNKIGE
EIVKKCCRLPIAIKTMA.TLRSKSKDTWKNALSRLQHHDINTIASTVFQTSYDNLEDE 
VTKATFLLCGLFPEDFNIPTEDLLRYGWGLKLFKEVDTIREARSKLKACIERLMHTN
LLIEGDDVRYVKMHDLVRAFVLDMFSKAEHASIVNHGSSKPRWPETESDVSSSCKR
ISLTCKG?



RG2V polynucleotide sequence (SEQ ID NO:130) 

CTGTGGAAGACACGAATGATSAAGAAGCTGAAGGAGGTCGTGGAACAAAA
GAAAATGTTCAATATTATTGTTCAAGTGGTCATAGGAGAGAAGACAAACC 
CTATTGCTATTCAGCAAGCTGTAGCAGATTACCTCTCTATTGAGCTGAAA 
GAAAACACTAAAGAAGCAAGAGCTGATAAGCTTCGTNAATGGTTCGAGGA 
CGATGGAGGAAAGAATAAGTTCCTTGTAATACTTGATGATGTATGGCAGT 

TTGTCGATCTTGAAGATATTGGTTTAAGTCCTCTGCCAAATAAAGGTGTC
AACTTCAAGGTCTTGTTGACGTTAAGAGATTCACATGTTTGCACTCTGAT 
GGGAGCTGAAGCCAATTCAATTCTCAATATAAAAGTTTTAAAAGATGTTN 
AAGGACAAAGTTTGTTCCGCCAGTTTGCTAAAAATGCAGGTGATGATGAC 
CTGGATCCTGCTTTCAATGGGATAGCAGATAGTATTGCAAGTAGATGTCA 

AGGTTTGCCCATTGCCATCAAAACCATTGCCTTAAGTCTTAAAGGTAGAA
GCAAGCCTGCGTGGGACCATGCGCTTTCTCGTTTGGAGAACCATAAGATT 
GGTAGTGAAGAAGTTGTGCGTGAAGTTTTTAAAATTAGCTATGACAATCT 
CCAAGATGAGGTTACTAAATCTATTTTTWTACTTTGTGCTTTATTTCCTG 
AAGATTTTGATATTCCTATTGAGGAGTTGGTGAGGTATGGGTGGGGCTTG 

AAATTATTTATAGAAGCAAAAACTATAAGAGAAGCAAGAAACAGGCTCAA
CACCTGCACTGAGCGGCTTAGGGAGACAAATTTGTTATTTGGAAGTGATG 
ACATTGGATGCGTCAAGATGCACGATGTGGTGCGTGATTTTGTTTGGTAT 
ATATTCTCAGAAGTCCAGCACGCTTCAATTGTCAACCATGGTAATGTGTC 
AGAGTGGCTAGAGGAAAATCATAGCATCTACTCTTGTAAAAGAATTTCAT 

TAACATGCAAGGGTATGTCTGAGTTTCCCAAAGACCTCAAATTTCCAAAC
CrXTCAATTTTGAAACTrATGCATGGAGATAAGTCGNTGAGCTTTCCTGA 
AGACTTTTATGGAAAGATGGAAAAGGTTCAGGTAATATCATATGATAAAT 
TGATGTATCCATTGCTTCCCTCATCACTTGAATGCTCCACTAACGTTCGA 
GTGCTTCATCTCCATTATTGTTCATTAAGGATGTTTGATTGCTCTTCAAT 

TGGTAATCTTCTCAACATGGAAGTGCTCAGCTTTGCTAATTCTAACATTG
AATGGTTACCATCTACAATTGGAAATTTGAAGAAGCTAAGGCTACTAGAT 
TTGACAAATTGTAAAGGTCTTCGTATAGATAATGGTGTCTTAAAAAATTT 
GGTCAAACTTGAAGAGCTTTATATGGGTGTTAATGTCCGTATGGACCAGG 
CCGT 


RG2V deduced polypeptide sequence (SEQ ID NO:131)

LWKTRM?KKLKEVVEQKKMFNIIVQVVIGEKTNPIAIQQAVADYLSIELKENTKEAR
ADKLR?WFEDDGGKNKFLVILDDVWQFVDLEDIGLSPLPNKGVNFKVLLTLRDSH
VCTLMGAEANSILNIKVLKDV?GQSLFRQFAKNAGDDDLDPAFNGIADSIASRCQGL
PIAIKTIALSLKGRSKPAWDHALSRLENHKIGSEEVVREVFKISYDNLQDEVTKSIF?L
CALFPEDFDIPIEELVRYGWGLKLFIEAKTIREARNRLNTCTERLRETNLLFGSDDIG



WO 98/30083 PCT/US98/00615



CVKMHDVVRDFVWYIFSEVQHASIVNHGNVSEWLEENHSIYSCKRISLTCKGMSEF
PKDLKFPNLSILKLMHGDKS?SFPEDFYGKMEKVQVISYDKLMYPLLPSSLECSTNV
RVLHLHYCSLRMFDCSSIGNLLNMEVLSFANSNIEWLPSTIGNLKKLRLLDLTNCKG
LRIDNGVLKNLVKLEELYMGVNVRMDQAV

RG2W polynucleotide sequence (SEQ ID NO:132) 

TTGGGAAAGAGACAATGATGAAGAATTGAAAGAGGTTGTGGTTGAAAAGA 

AAATGTTTAATCATTATGTGGAGGCGGTTATAGGGGAGAAGACGGACCCC 

ATTGCTATTCAGCAAGCCGTTGCAGAGTACCTTGGTATAATTCTAACAGA 

AACCACTAAGGCAGCAAGAACCGATAAGCTACGTGCATGGCTTTCTGACA
ATTCAGATGGAGGAAGAAAGAAGTTCCTAGTAATACTAGACGATGTATGG 
CATCCGGTTGATATGGAAGATATTGGTTTAAGTCGTTTCCCAAATCAAGG 
TGTCGACTTCAAGGTCTTGATTACATCACGGGACCAAGCTGTTTGCACTG 
AGATGGGAGTTAAAGCTGATTCAGTTATCAAGGTGAGTGTCCTAGAGGAA 

GCTGAAGCACAAAGCTTATTCTGCCAACTTTGGGAACCTTCTGATGATGT
CGATCCTGAGCTCCATCAGATTGGAGAAGAAATTGTAAGGAAGTGTTGTG 
GTTTACCCATTGCAATAAAAACCATGGCCTGCACTCTTAGAAGTAAAAGC 
AAGGATACATGGAAGAATGCACTTTCTCGTTTACAACACCATGACATTAA 
CACAGTCGCGCCTACTGTTTTTCAAACCAGCTATGACAATCTCCAAGATG 

AGGTGACTGGAGATACTTTTTTGCTATGTGGTTTGTTTCCGGAGGACTTC
GATATTCCTACTGAAGACTTATTGAAGTATGGATGGGGCTTAAAATTATT 
CAAGGGAGTGGATTCTGTAAGAGAAGCAAGATACCAGTTGAACGCCTGCA 
TTGAGCGGCTCGTGCATACCAATTTGTTGATTGAAAGTGATGTTGTTGGG 
TGCGTCAAGTTGCACGATCTGGTGCGTGCTTTTATTTTGGATATGTTTTG 

TAAAGCGGAGCATGCTTCGATTGTCAACCATGGTAGTAGTAAGCCTGGGT
GGCCTGAAACTGAAAATGATGTGATCAGGACCTCCTGCAAAAGAATCTCA
TTAACATGCAAGGGTATGATTGAGTTTTCTAGTGACCTCAAGTTTCCAAA
TGTCTTGATTTTAAAACTTATGCATGGAGATAAGTCGCTAAGGTTT 

RG2W deduced polypeptide sequence (SEQ ID NO:133)

WERDNDEELKEVVVEKKMFNHYVEAVIGEKTDPIAIQQAVAEYLGIILTETTKAAR
TDKLRAWLSDNSDGGRKKFLVILDDVWHPVDMEDIGLSRFPNQGVDFKVLITSRD
QAVCTEMGVKADSVIKVSVLEEAEAQSLFCQLWEPSDDVDPELHQIGEEIVRKCCG
LPIAIKTMACTLRSKSKDTWKNALSRLQHHDINTVAPTVFQTSYDNLQDEVTGDTF
LLCGLFPEDFDIPTEDLLKYGWGLKLFKGVDSVREARYQLNACIERLVHTNLLIESD
VVGCVKLHDLVRAFILDMFCKAEHASIVNHGSSKPGWPETENDVIRTSCKRISLTCK
GMIEFSSDLKFPNVLILKLMHGDKSLRF

RG5 polynucleotide sequence (SEQ ID NO:134) 

GGGGGGGTGGGGAAGNCGAGTCTAGCCCAGAAGNTCTATAATGACCATAA
AATAAAAGGAAGCTTTAGTAAACAAGCATGGATCTGTGTTTCTCAACAAT
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ATTCTGATATTTCAGTTTTGAAAGAAGTCCTTCGGAACATCGGTGTTGAT 
TATAAGCATGATGAAACTGTTGGAGAACTTAGCAGAAGGCTTGCAATAGC 
TGTCGAAAATGCAAGTTTCTTTCTTGTGTTGGATGATATTTGGCAACATG 
AGGTGTGGACTAATTTACTCAGAGCCCCATTAAACACTGCAGCTACAGGA 
ATAATTCTAGTAACAACTCGTAATGATACAGTTGCACGAGCAATTGGGGT
GGAAGATATTCATCGAGTAGAATTGATGTCAGATGAAGTAGGATGGAAAT 
TGCTTTTGAAGAGTATGAACATTAGCAAAGAAAGTGAAGTAGAAAACCTA 
CGAGTTTTAGGGGTTGACATTGTTCGTTTGTGTGGTGGCCTCCCCCTAGC 
CTT 


RG5 deduced polypeptide sequence (SEQ ID NO:135) 

GGVGKTTLAQK?YNDHKIKGSFSKQAWICVSQQYSDISVLKEVLRNIGVDYKHDET
VGELSRRLAIAVENASFFLVLDDIWQHEVWTNLLRAPLNTAATGIILVTTRNDTVA
RAIGVEDIHRVELMSDEVGWKLLLKSMNISKESEVENLRVLGVDIVRLCGGLPLAL
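
The relationship between a listed polynucleotide sequence and its deduced polypeptide sequence can be illustrated with a minimal translation sketch. It assumes the standard genetic code and a reading frame chosen by the reader; the function name translate and its frame parameter are hypothetical and are not part of the disclosure. Codons containing an ambiguity code such as N are reported as "?", following the notation used in the listings above.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# (third base varying fastest). "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nucleotides: str, frame: int = 0) -> str:
    """Translate a DNA string in the given frame (0, 1 or 2).

    Codons that are not in the table (for example, codons containing an
    ambiguity code such as N) are reported as "?". A trailing partial
    codon is ignored.
    """
    seq = nucleotides.upper().replace("U", "T")
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "?") for codon in codons)


# First 15 nucleotides of the RG5 polynucleotide sequence (SEQ ID NO:134)
print(translate("GGGGGGGTGGGGAAG"))  # GGVGK
```

The printed residues correspond to the first five residues of the RG5 deduced polypeptide sequence (SEQ ID NO:135). Other listings may be read in a different frame, which is why the frame is left as a parameter.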


RG7 polynucleotide sequence (SEQ ID NO:136) 

GGTGGGGTTGGGAAGACAACGGGCACAAGGAGGCGACTGCCAATACTTCC 

GACTTTTATTCATAGAGATGACGAGTCTTATTTTCCTACTACTATAGGGA 

GGATATTTGGTTGCGCGAGACGATTCATTGCGCGAAGGGATTCTATCCTT 

CTTTTTTTCCGCGAAGACTTCGTTCCGGAGGACGGGCTATATTCCCTTTA
ATATTAGTCTAGCCCAGTCTAGGCCAACCATATGGCGATGCGGTAGACCT 
CCCAGAGATAGATACTTGATCTTAGAGGATTCACACGTTCAATGGTGGAA 
ACTTAAGGAACCGGCTAAGAGTGACTAAACGGAAAAACCCTATTCATTCC 
ATAGCCTCATCCGGTCGAGGCATTAAACAATCCATCCCAATCCTCTTTCC 

TTTGGTCTACTCTAATGATGTGCCCGTTCGTTGGTGGAATATCTCTTTAT
ACCGACGATTTATATGGGGATTGCCACTAGCGTTG 

The above examples are provided to illustrate the invention but not to limit
its scope. Other variants of the invention will be readily apparent to one of ordinary skill
in the art and are encompassed by the appended claims. All publications, patents, and
patent applications cited herein are hereby incorporated by reference.

WHAT IS CLAIMED IS:

1. An isolated nucleic acid construct comprising an RG polynucleotide which encodes
an RG polypeptide having at least 60% sequence identity to an RG polypeptide from an
RG family selected from the group consisting of: an RG1 polypeptide, an RG2
polypeptide, an RG3 polypeptide, an RG4 polypeptide, an RG5 polypeptide, and an RG7
polypeptide.

2. The nucleic acid construct of claim 1, wherein the RG polynucleotide encodes an
RG polypeptide comprising a leucine rich region (LRR).

3. The nucleic acid construct of claim 1, wherein the RG polynucleotide encodes an 
RG polypeptide comprising a nucleotide binding site (NBS). 

4. The nucleic acid construct of claim 1, wherein the polynucleotide is a full length 
gene. 

5. The nucleic acid construct of claim 1, wherein the polynucleotide further encodes a fusion protein.

6. The nucleic acid construct of claim 1, wherein the RG1 polypeptide is encoded by
an RG1 polynucleotide sequence.

7. The nucleic acid construct of claim 6, wherein the RG1 polypeptide is encoded by a
polynucleotide sequence selected from the group consisting of SEQ ID NO:1 (RG1A),
SEQ ID NO:2 (RG1B), SEQ ID NO:3 (RG1C), SEQ ID NO:4 (RG1D), SEQ ID NO:5
(RG1E), SEQ ID NO:6 (RG1F), SEQ ID NO:7 (RG1G), SEQ ID NO:8 (RG1H), SEQ ID
NO:9 (RG1I), and SEQ ID NO:10 (RG1J).

8. The nucleic acid construct of claim 1, wherein the RG2 polypeptide is encoded by
an RG2 polynucleotide sequence. 

9. The nucleic acid construct of claim 8, wherein the RG2 polypeptide is encoded by a
polynucleotide sequence selected from the group consisting of: SEQ ID NO:21 (RG2A);
SEQ ID NO:23 (RG2B); SEQ ID NO:25 (RG2C); SEQ ID NO:27 (RG2D); SEQ ID
NO:29 (RG2E); SEQ ID NO:31 (RG2F); SEQ ID NO:33 (RG2G); SEQ ID NO:35
(RG2H); SEQ ID NO:37 (RG2I); SEQ ID NO:39 (RG2J); SEQ ID NO:41 (RG2K); SEQ
ID NO:43 (RG2L); SEQ ID NO:45 (RG2M); SEQ ID NO:87 (RG2A); SEQ ID NO:89
(RG2B); SEQ ID NO:91 (RG2C); SEQ ID NO:93 (RG2D) and SEQ ID NO:94 (RG2D);
SEQ ID NO:96 (RG2E); SEQ ID NO:98 (RG2F); SEQ ID NO:100 (RG2G); SEQ ID
NO:102 (RG2H); SEQ ID NO:104 (RG2I); SEQ ID NO:106 (RG2J) and SEQ ID NO:107
(RG2J); SEQ ID NO:109 (RG2K) and SEQ ID NO:110 (RG2K); SEQ ID NO:112 (RG2L);
SEQ ID NO:114 (RG2M); SEQ ID NO:116 (RG2N); SEQ ID NO:118 (RG2O); SEQ ID
NO:120 (RG2P); SEQ ID NO:122 (RG2Q); SEQ ID NO:124 (RG2S); SEQ ID NO:126
(RG2T); SEQ ID NO:128 (RG2U); SEQ ID NO:130 (RG2V); and, SEQ ID NO:132
(RG2W).

10. The nucleic acid construct of claim 1, wherein the RG3 polypeptide is encoded by
an RG3 polynucleotide sequence.

11. The nucleic acid construct of claim 10, wherein the RG3 polypeptide is encoded by
a polynucleotide sequence as set forth in SEQ ID NO:68.

12. The nucleic acid construct of claim 1, wherein the RG4 polypeptide is encoded by
an RG4 polynucleotide sequence.

13. The nucleic acid construct of claim 12, wherein the RG4 polypeptide is encoded by
a polynucleotide sequence as set forth in SEQ ID NO:69.

14. The nucleic acid construct of claim 1, wherein the RG5 polypeptide is encoded by 
an RG5 polynucleotide sequence. 

15. The nucleic acid construct of claim 14, wherein the RG5 polypeptide is encoded by
a polynucleotide sequence as set forth in SEQ ID NO:134.

16. The nucleic acid construct of claim 1, wherein the RG7 polypeptide is encoded by
an RG7 polynucleotide sequence.

17. The nucleic acid construct of claim 16, wherein the RG7 polypeptide is encoded by
a polynucleotide sequence as set forth in SEQ ID NO:136.

18. The nucleic acid construct of claim 1, further comprising a promoter operably 
linked to the RG polynucleotide. 

19. The nucleic acid construct of claim 18, wherein the promoter is a plant promoter. 

20. The nucleic acid construct of claim 19, wherein the plant promoter is a disease
resistance promoter. 

21. The nucleic acid construct of claim 19, wherein the plant promoter is a lettuce
promoter. 

22. The nucleic acid construct of claim 18, wherein the promoter is a constitutive 
promoter. 

23. The nucleic acid construct of claim 18, wherein the promoter is an inducible 
promoter. 

24. The nucleic acid construct of claim 18, wherein the promoter is a tissue-specific 
promoter. 

25. A nucleic acid construct comprising a promoter sequence from an RG gene linked 
to a heterologous polynucleotide. 

26. A transgenic plant comprising a recombinant expression cassette comprising a
promoter operably linked to an RG polynucleotide.

27. The transgenic plant of claim 26, wherein the plant promoter is a plant promoter.

28. The transgenic plant of claim 26, wherein the plant promoter is a viral promoter. 

29. The transgenic plant of claim 26, wherein the plant promoter is a heterologous
promoter. 

30. The transgenic plant of claim 26, wherein the plant is lettuce. 

31. The transgenic plant of claim 26, wherein the RG polynucleotide is selected from
the group consisting of SEQ ID NO:1 (RG1A), SEQ ID NO:2 (RG1B), SEQ ID NO:3
(RG1C), SEQ ID NO:4 (RG1D), SEQ ID NO:5 (RG1E), SEQ ID NO:6 (RG1F), SEQ ID
NO:7 (RG1G), SEQ ID NO:8 (RG1H), SEQ ID NO:9 (RG1I), and SEQ ID NO:10
(RG1J).

32. The transgenic plant of claim 26, wherein the RG polynucleotide is selected from
the group consisting of SEQ ID NO:21 (RG2A); SEQ ID NO:23 (RG2B); SEQ ID NO:25
(RG2C); SEQ ID NO:27 (RG2D); SEQ ID NO:29 (RG2E); SEQ ID NO:31 (RG2F); SEQ
ID NO:33 (RG2G); SEQ ID NO:35 (RG2H); SEQ ID NO:37 (RG2I); SEQ ID NO:39
(RG2J); SEQ ID NO:41 (RG2K); SEQ ID NO:43 (RG2L); SEQ ID NO:45 (RG2M); SEQ
ID NO:87 (RG2A); SEQ ID NO:89 (RG2B); SEQ ID NO:91 (RG2C); SEQ ID NO:93
(RG2D) and SEQ ID NO:94 (RG2D); SEQ ID NO:96 (RG2E); SEQ ID NO:98 (RG2F);
SEQ ID NO:100 (RG2G); SEQ ID NO:102 (RG2H); SEQ ID NO:104 (RG2I); SEQ ID
NO:106 (RG2J) and SEQ ID NO:107 (RG2J); SEQ ID NO:109 (RG2K) and SEQ ID
NO:110 (RG2K); SEQ ID NO:112 (RG2L); SEQ ID NO:114 (RG2M); SEQ ID NO:116
(RG2N); SEQ ID NO:118 (RG2O); SEQ ID NO:120 (RG2P); SEQ ID NO:122 (RG2Q);
SEQ ID NO:124 (RG2S); SEQ ID NO:126 (RG2T); SEQ ID NO:128 (RG2U); SEQ ID
NO:130 (RG2V); and, SEQ ID NO:132 (RG2W).

33. The transgenic plant of claim 26, wherein the RG polynucleotide is selected from
the group consisting of SEQ ID NO:68 (RG3) and SEQ ID NO:69 (RG4).

34. The transgenic plant of claim 26, wherein the RG polynucleotide comprises a
sequence as set forth in SEQ ID NO:134 (RG5).

35. The transgenic plant of claim 26, wherein the RG polynucleotide comprises a
sequence as set forth in SEQ ID NO:136 (RG7).

36. The transgenic plant of claim 26, wherein the RG polynucleotide encodes an RG1
polypeptide selected from the group consisting of SEQ ID NO:11 (RG1A), SEQ ID NO:12
(RG1B), SEQ ID NO:13 (RG1C), SEQ ID NO:14 (RG1D), SEQ ID NO:15 (RG1E), SEQ
ID NO:16 (RG1F), SEQ ID NO:17 (RG1G), SEQ ID NO:18 (RG1H), SEQ ID NO:19
(RG1I), and SEQ ID NO:20 (RG1J).

37. The transgenic plant of claim 26, wherein the RG polynucleotide encodes an RG2
polypeptide selected from the group consisting of SEQ ID NO:22 and SEQ ID NO:41
(RG2A); SEQ ID NO:24 and SEQ ID NO:42 (RG2B); SEQ ID NO:43 (RG2C); SEQ ID
NO:44 (RG2D); SEQ ID NO:45 (RG2E); SEQ ID NO:46 (RG2F); SEQ ID NO:47
(RG2G); SEQ ID NO:48 (RG2H); SEQ ID NO:49 (RG2I); SEQ ID NO:50 (RG2J); SEQ
ID NO:51 (RG2K); SEQ ID NO:52 (RG2L); SEQ ID NO:53 (RG2M); SEQ ID NO:88
(RG2A); SEQ ID NO:90 (RG2B); SEQ ID NO:92 (RG2C); SEQ ID NO:95 (RG2D); SEQ
ID NO:97 (RG2E); SEQ ID NO:99 (RG2F); SEQ ID NO:101 (RG2G); SEQ ID NO:103
(RG2H); SEQ ID NO:105 (RG2I); SEQ ID NO:108 (RG2J); SEQ ID NO:111 (RG2K); SEQ
ID NO:113 (RG2L); SEQ ID NO:115 (RG2M); SEQ ID NO:117 (RG2N); SEQ ID NO:119
(RG2O); SEQ ID NO:121 (RG2P); SEQ ID NO:123 (RG2Q); SEQ ID NO:125 (RG2S);
SEQ ID NO:127 (RG2T); SEQ ID NO:129 (RG2U); SEQ ID NO:131 (RG2V); and, SEQ ID
NO:133 (RG2W).

38. The transgenic plant of claim 26, wherein the RG polynucleotide encodes an RG3
polypeptide with a sequence as set forth by SEQ ID NO:138.

39. The transgenic plant of claim 26, wherein the RG polynucleotide encodes an RG4
polypeptide with a sequence as set forth by SEQ ID NO:139.

40. The transgenic plant of claim 26, wherein the RG polynucleotide encodes an RG5
polypeptide with a sequence as set forth by SEQ ID NO:135.

41. A method of enhancing disease resistance in a plant, the method comprising
introducing into the plant a recombinant expression cassette comprising a promoter
functional in the plant and operably linked to an RG polynucleotide sequence.

42. The method of claim 41, wherein the plant is a lettuce plant.

43. The method of claim 41, wherein the RG polynucleotide encodes an RG polypeptide
selected from the group consisting of SEQ ID NO:22 and SEQ ID NO:41 (RG2A); SEQ
ID NO:24 and SEQ ID NO:42 (RG2B); SEQ ID NO:43 (RG2C); SEQ ID NO:44 (RG2D);
SEQ ID NO:45 (RG2E); SEQ ID NO:46 (RG2F); SEQ ID NO:47 (RG2G); SEQ ID
NO:48 (RG2H); SEQ ID NO:49 (RG2I); SEQ ID NO:50 (RG2J); SEQ ID NO:51
(RG2K); SEQ ID NO:52 (RG2L); SEQ ID NO:53 (RG2M); SEQ ID NO:88 (RG2A); SEQ
ID NO:90 (RG2B); SEQ ID NO:92 (RG2C); SEQ ID NO:95 (RG2D); SEQ ID NO:97
(RG2E); SEQ ID NO:99 (RG2F); SEQ ID NO:101 (RG2G); SEQ ID NO:103 (RG2H); SEQ
ID NO:105 (RG2I); SEQ ID NO:108 (RG2J); SEQ ID NO:111 (RG2K); SEQ ID NO:113
(RG2L); SEQ ID NO:115 (RG2M); SEQ ID NO:117 (RG2N); SEQ ID NO:119 (RG2O);
SEQ ID NO:121 (RG2P); SEQ ID NO:123 (RG2Q); SEQ ID NO:125 (RG2S); SEQ ID
NO:127 (RG2T); SEQ ID NO:129 (RG2U); SEQ ID NO:131 (RG2V); and, SEQ ID NO:133
(RG2W).

44. The method of claim 41, wherein the RG polynucleotide encodes an RG polypeptide
selected from the group consisting of SEQ ID NO:138 (RG3); SEQ ID NO:139 (RG4); and
SEQ ID NO:135 (RG5).

45. The method of claim 41, wherein the promoter is a tissue-specific promoter or a
plant disease resistance promoter.

46. The method of claim 41, wherein the promoter is a constitutive promoter or an
inducible promoter.

47. A method of detecting RG resistance genes in a nucleic acid sample, the method 
comprising: 

contacting the nucleic acid sample with an RG polynucleotide to form a 
hybridization complex; and, 

wherein the formation of the hybridization complex is used to detect the RG 
resistance gene in the nucleic acid sample. 

48. The method of claim 47, wherein the RG polynucleotide is an RGl polynucleotide. 

49. The method of claim 47, wherein the RG polynucleotide is an RG2 polynucleotide. 

50. The method of claim 47, wherein the RG polynucleotide is an RG3 polynucleotide,
an RG4 polynucleotide, an RG5 polynucleotide or an RG7 polynucleotide.

51. The method of claim 47, wherein the RG resistance gene is amplified prior to the
step of contacting the nucleic acid sample with the RG polynucleotide. 

52. The method of claim 51, where the RG resistance gene is amplified by the 
polymerase chain reaction. 

53. The method of claim 47, wherein the RG polynucleotide is labeled. 

54. An RG polypeptide having at least 60% sequence identity to a polypeptide selected
from the group consisting of: an RG1 polypeptide, an RG2 polypeptide, an RG3
polypeptide, an RG4 polypeptide, an RG5 polypeptide, and an RG7 polypeptide.
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